[R] Document Term Matrix will not maintain decimal places of numbers or capture all terms
Will Ebert
willebert34 at gmail.com
Tue Mar 14 18:46:59 CET 2017
Before I updated my version of RStudio (1.0.136), everything worked great.
Since the update, DocumentTermMatrix in the 'tm' package behaves
differently. I want to create a DTM whose terms are numbers, decimals
included. For instance, if I have a .csv with one column as shown below:
x
1.01
11.21
123.35
212.11
I want the column names in my term matrix to look like this:
1.01 11.21 123.35 212.11
   1     0      0      0
   0     1      0      0
   0     0      1      0
   0     0      0      1
But instead it looks like this:
123 212
  0   0
  0   0
  1   0
  0   1
Here's the code that used to work:
corpus = Corpus(VectorSource(x))
dtm = DocumentTermMatrix(corpus)
dtm_df = as.data.frame(as.matrix(dtm))
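For anyone trying to reproduce this, here is a minimal self-contained
sketch. It assumes x holds the column as character strings, using the
values shown above. The VCorpus() call and the wordLengths control option
are not part of the code above. They are an assumption about what might
restore the old behaviour, on the guess that Corpus() in newer 'tm'
releases may build a different corpus class, and given that
DocumentTermMatrix drops terms shorter than 3 characters by default.

```r
library(tm)

# Example input: the values from the post, kept as character strings
# (an assumption; read.csv would normally return them as numeric).
x <- c("1.01", "11.21", "123.35", "212.11")

# Original approach, kept for comparison.
corpus_orig <- Corpus(VectorSource(x))
print(class(corpus_orig))  # shows which corpus class Corpus() produced
dtm_orig <- DocumentTermMatrix(corpus_orig)
print(Terms(dtm_orig))

# Variant to try (an assumption, not a confirmed fix):
# an explicit VCorpus, and no minimum term length.
corpus <- VCorpus(VectorSource(x))
dtm <- DocumentTermMatrix(corpus, control = list(wordLengths = c(1, Inf)))
dtm_df <- as.data.frame(as.matrix(dtm))
print(dtm_df)
```

Comparing Terms(dtm_orig) with the column names of dtm_df should at least
narrow down whether the terms are lost when the corpus is built or when the
matrix is computed.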
I have tried uninstalling and reinstalling everything, and tried older
versions (RStudio 0.99.489 and R 3.3.1), but I get the same results. I
asked others to test it, and it works for them. However, I also had someone
download R, Rtools, and RStudio to test this, and they got the same results
I did. I have no idea what has happened and would greatly appreciate help
on this matter, as it is extremely urgent.
Thanks in advance
Will