[R] Mining non-English text

Loris Bennett loris.bennett at fu-berlin.de
Wed Mar 4 08:52:05 CET 2015

saikiran putta <putta.saikiran1994 at gmail.com> writes:

> I am new to R programming and am trying to mine this PDF file. The file
> is in a non-English language and I'm not able to figure out how to
> proceed. I'm not even sure how to extract information from a PDF file,
> so please help!

Nothing to do with R, but the command-line program pdftotext might help
you to get going. It is available for Linux and, apparently, for
Windows, and it can deal with various encodings.
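
As a rough sketch of how that might look in practice: the shell function below wraps pdftotext and asks for UTF-8 output through its -enc option. The function name pdf_to_text and the default output name are made up for illustration, and it assumes pdftotext (shipped with Xpdf or Poppler) is already installed and on the PATH.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): convert a PDF to UTF-8 text.
# Usage: pdf_to_text input.pdf [output.txt]
pdf_to_text() {
    src="$1"
    # Default output name: same as the input with .pdf replaced by .txt.
    dst="${2:-${src%.pdf}.txt}"

    # Fail early if the input file is missing.
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi

    # pdftotext is a separate program and may need installing first.
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext not found on PATH" >&2
        return 127
    fi

    # -enc selects the output text encoding; UTF-8 covers most scripts.
    pdftotext -enc UTF-8 "$src" "$dst"
}
```

Calling pdf_to_text report.pdf (a placeholder file name) should leave the text in report.txt, which could then be read into R with something like readLines("report.txt", encoding = "UTF-8"). Whether the text comes out cleanly depends on how the PDF was produced; a scanned document would need OCR instead.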



