[R] Min Frequency in findFreqTerms
vioravis
vioravis at gmail.com
Wed Nov 9 10:15:00 CET 2011
I am using the 'tm' package for text mining. I use the function findFreqTerms
to obtain the frequent words based on their frequency in the term-document
matrix.
The following is the example given in the help page of this function:
library("tm")
data("crude")
tdm <- TermDocumentMatrix(crude)
findFreqTerms(tdm, 2, 3)
The first three columns of the document-term matrix (one row per document) are
shown below:
(bpd)   (bpd).  (gcc)
0       0       0
0       0       0
0       0       0
0       0       0
0       0       0
1       0       0
0       0       0
0       0       0
0       0       0
1       0       0
1       0       0
0       0       1
0       0       0
0       1       0
0       0       0
0       0       0
0       0       0
0       0       0
0       0       0
0       0       0
The first term "(bpd)" has a total frequency of 3, whereas the second and third
terms each have a frequency of 1, which is below the lowfreq = 2 specified.
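This reading can be checked by hand. The following is a minimal sketch in base R, assuming that "frequency" means the total count of a term across all 20 documents; whether findFreqTerms actually uses that definition is exactly the open question. The matrix dtm and the variables totals and in_range are hypothetical names for illustration, and the counts are copied from the three columns shown above.

```r
# Rebuild the three columns shown above: 20 documents (rows) by 3 terms (columns).
dtm <- matrix(0, nrow = 20, ncol = 3,
              dimnames = list(NULL, c("(bpd)", "(bpd).", "(gcc)")))
dtm[c(6, 10, 11), "(bpd)"] <- 1   # "(bpd)" occurs once in documents 6, 10 and 11
dtm[14, "(bpd)."] <- 1            # "(bpd)." occurs once in document 14
dtm[12, "(gcc)"] <- 1             # "(gcc)" occurs once in document 12

# Assumed definition: total occurrences of each term over all documents.
totals <- colSums(dtm)

# Terms whose total lies in the range lowfreq = 2 to highfreq = 3.
in_range <- totals >= 2 & totals <= 3
names(totals)[in_range]
```

Under this assumption only "(bpd)" falls in the range. On the real object the analogous totals would come from rowSums(as.matrix(tdm)), since terms are the rows of a term-document matrix.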
Can someone tell me whether this is the right way to interpret this function?
If so, is there a bug in the package?
Thank you.
Ravi