[R] Calculating distance between words in string
David Winsemius
dwinsemius at comcast.net
Fri Nov 6 17:56:53 CET 2015
> On Nov 6, 2015, at 3:28 AM, Karl <josip.2000 at gmail.com> wrote:
>
> Hi All,
>
> Using R for text processing is quite new to me, while I have found a lot of
> useful functions and I'm beginning to learn regex, I need help with the
> following task. How do I calculate the distance between words?
>
> That is, given a specific keyword, I need to assign labels to the other
> words based on the distance (number of words) to this keyword.
>
> For example, if the keyword is "amet" and the string of words is
strng <- "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
> -> "dolor" would get a value of -2
> -> "elit" would get a value of 3
words <- unlist(strsplit(strng, "\\W"))
words[words != ""]
#[1] "Lorem" "ipsum" "dolor" "sit"
#[5] "amet" "consectetur" "adipiscing" "elit"
real <- words[words != ""]
which(real == "amet")
#[1] 5
length(real)
#[1] 8
vec <- 1:length(real) - which(real == "amet")
names(vec) <- real
vec["dolor"]
#dolor
# -2
>
> If the sentence contains more than one instance of the keyword, I need
> values for each instance. Moreover, one can assume that I can split my data
> into sentences, so there is no need to search and recognize sentences (this
> is a separate problem).
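
For the case of several keyword instances, the same subtraction can be repeated once per match. The following is only a sketch under assumptions: `keyword_distances` is a hypothetical helper name (not from this thread), the test sentence is made up, and words are split on non-word characters just as above. It returns a matrix with one row per occurrence of the keyword and one column per word.

```r
# Sketch: signed word distances to every occurrence of a keyword.
# `keyword_distances` is a hypothetical helper, not part of base R.
keyword_distances <- function(sentence, keyword) {
  # split on runs of non-word characters and drop empty strings
  words <- unlist(strsplit(sentence, "\\W+"))
  words <- words[words != ""]
  hits <- which(words == keyword)
  if (length(hits) == 0) return(NULL)   # keyword not present
  # rows: keyword occurrences; columns: words; entry = word position - hit position
  d <- outer(hits, seq_along(words), function(h, i) i - h)
  dimnames(d) <- list(sprintf("%s@%d", keyword, hits), words)
  d
}

# made-up sentence with three instances of the keyword
s <- "amet dolor sit amet, consectetur amet elit."
d <- keyword_distances(s, "amet")
d["amet@4", "dolor"]   # distance of "dolor" from the second "amet"
```

Row names record the position of each match, so `d["amet@4", ]` gives the labels relative to the keyword found at word 4. Indexing a column by name returns only the first match when a word is repeated, so positions are safer than names for repeated words.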
>
> Thank you!
>
> Best regards,
> Jay
>
David Winsemius
Alameda, CA, USA