[R] DocumentTermMatrix error
    Matevž Pavlič 
    matevz.pavlic at gi-zrmk.si
       
    Sat May 21 13:26:40 CEST 2011
    
    
  
Hi all, 
 
I tried to create a DocumentTermMatrix with the tm package, but I get this error:
 
Error in tolower(txt) : 
  invalid input 'PROD Z LAHKO GNETNO MELJNO GLINO, ... in 'utf8towcs'
 
I followed the example shown in
http://www.r-project.org/doc/Rnews/Rnews_2008-2.pdf (An Introduction to Text Mining),
 
using this R code:
 
setwd("C:/Users/mpavlic/Desktop/temp")
tekst <- Corpus(DirSource("."))
>Warning message:
>In readLines(y, encoding = x$Encoding) :
>incomplete final line found on './test.txt'
 
meta(tekst, "Heading", "local") <- c("test")
meta(tekst[[1]])
>Available meta data pairs are:
  Author       : 
  DateTimeStamp: 2011-05-21 11:25:21
  Description  : 
  Heading      : test
  ID           : test.txt
  Language     : en
  Origin       :
 
test <- TermDocumentMatrix(tekst)
> Error in tolower(txt) : 
> invalid input 'PROD Z LAHKO GNETNO MELJNO GLINO, ... in 'utf8towcs'
 
 
Attached is a small sample (test.txt) that I worked with.
 
Any help would be appreciated,
m
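
A possible diagnostic, offered only as a sketch: the 'utf8towcs' message suggests that tolower() was handed text marked as UTF-8 that is not valid UTF-8, which could happen if test.txt was saved in a Windows code page. The example below uses base R only. The byte string is a hypothetical stand-in (the CP1250 bytes for "PEŠČENO"), not the contents of test.txt, and CP1250 as the source encoding is an assumption, since the attachment's charset is unspecified.

```r
# Hypothetical sample: CP1250 bytes for "PEŠČENO" (not taken from test.txt).
bytes <- as.raw(c(0x50, 0x45, 0x8a, 0xc8, 0x45, 0x4e, 0x4f))

# Mimic reading a CP1250 file while declaring it to be UTF-8:
# the bytes are kept as they are and only the encoding mark is set.
txt <- rawToChar(bytes)
Encoding(txt) <- "UTF-8"

# validUTF8() (R >= 3.3.0) reports whether the bytes form valid UTF-8.
is_valid_before <- validUTF8(txt)

# Re-encode from the assumed source encoding to real UTF-8.
# "CP1250" is an assumption; substitute the file's actual encoding.
fixed <- iconv(rawToChar(bytes), from = "CP1250", to = "UTF-8")
is_valid_after <- validUTF8(fixed)

# tolower() should now accept the string; how non-ASCII letters are
# lowercased can depend on the platform and locale.
lowered <- tolower(fixed)
```

If the check on the real file also reports invalid UTF-8, converting the text with iconv() before building the corpus may avoid the error. Which encoding to convert from would have to be determined from the file itself.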
 
 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110521/fe77f990/attachment.txt>