[R] findAssocs Heatmap in R

Elahe chalabi ch@l@bi@el@he @ending from y@hoo@de
Thu Oct 25 11:24:07 CEST 2018


Hi all,


I have a document-term matrix and would like to draw a heatmap (geom_tile) of the 20 words most strongly associated with a specific word in it. Here is how I build my DTM:


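library(tm)  # stemDocument() also needs the SnowballC package installed

corpus <- Corpus(VectorSource(data$Message))
# wrap base functions in content_transformer() so the corpus structure is kept
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument, "english")
frequencies <- DocumentTermMatrix(corpus)
# drop terms that are absent from more than 99.5% of the documents
frequencies <- removeSparseTerms(frequencies, 0.995)
frequencies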
<<DocumentTermMatrix (documents: 16630, terms: 399)>>
Non-/sparse entries: 118557/6516813
Sparsity           : 98%
Maximal term length: 43
Weighting          : term frequency (tf)

And this is the word for which I want the 20 most associated terms in the DTM:

word <- "problem"
corr <- c(0.7, 0.75, 0.1)
my_assocs <- findAssocs(frequencies, word, corr)

My problem is the ggplot call, which should contain only the 20 most associated words. How do I get these into ggplot?
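One possible way to approach this is sketched below. It is a minimal sketch, not a definitive solution. It assumes findAssocs() returns a named list with one named numeric vector of correlations per query word, and that the lower correlation limit is set low enough for at least 20 terms to pass. The helper top_assocs_df() is hypothetical (it is not part of tm), and the correlation values are made up so that the code runs without the original data.

```r
library(ggplot2)

# Hypothetical helper (not part of tm): turn the list returned by
# findAssocs() into a long data frame with the n strongest associations
# per query word.
top_assocs_df <- function(assocs, n = 20) {
  rows <- lapply(names(assocs), function(w) {
    a <- head(sort(assocs[[w]], decreasing = TRUE), n)
    data.frame(word = rep(w, length(a)),
               term = as.character(names(a)),
               corr = as.numeric(a),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# With the real DTM this would be something like:
#   my_assocs <- findAssocs(frequencies, "problem", 0.1)
# Made-up stand-in values (an assumption) so the sketch runs on its own:
my_assocs <- list(problem = c(solv = 0.42, error = 0.35,
                              fix = 0.31, report = 0.18))

df <- top_assocs_df(my_assocs, n = 20)

# Fix the factor order so the strongest term is drawn at the top.
df$term <- factor(df$term, levels = rev(unique(df$term)))

p <- ggplot(df, aes(x = word, y = term, fill = corr)) +
  geom_tile(colour = "white") +
  geom_text(aes(label = sprintf("%.2f", corr))) +
  scale_fill_gradient(low = "white", high = "steelblue") +
  labs(x = NULL, y = NULL, fill = "Correlation")
print(p)
```

With a single query word the heatmap has only one column. Passing several words to findAssocs() would give one column per word, with the same helper stacking the results.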

Thanks for any help.
Elahe


