[R] Sorting text docs based on document meta values in tm()

Shad Thomas shad.thomas at glassboxresearch.com
Wed Aug 12 02:44:55 CEST 2009

Hi Kelvin,

I'm new to R and tm myself, but here is a way that you can sort your
corpus.  Please keep in mind that there may be a more efficient approach,
but this will get the job done.

Basically, there are three steps (in pseudo code):
1.  Extract the meta data for Age into a list
2.  Work out the ordering of the documents by Age
3.  Create a new corpus by copying entries from the old corpus in age order

Your actual code would look something like this:
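# 1. Pull the Age meta value out of every document
agelist <- lapply(mycorpus, meta, tag = "Age")
# 2. Convert to numeric so ages sort as numbers (as text, "100" < "24")
agedf <- data.frame(age = as.numeric(unlist(agelist)))
agelistorder <- order(agedf$age)
# 3. Index the corpus with the ordering (single index, as for a list)
mysortedcorpus <- mycorpus[agelistorder]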
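
If you want to check the ordering logic without tm, here is a minimal sketch in base R. It uses a plain list as a stand-in for the corpus (the docs structure and its field names are made up for illustration and are not how tm stores meta data) and the five ages from your example:

```r
# Stand-in for a corpus: a plain list of documents, each carrying an Age value.
# (Hypothetical structure; a real tm corpus stores meta data differently.)
docs <- list(
  list(text = "doc one",   Sex = "M", Age = "38"),
  list(text = "doc two",   Sex = "M", Age = "46"),
  list(text = "doc three", Sex = "F", Age = "24"),
  list(text = "doc four",  Sex = "F", Age = "49"),
  list(text = "doc five",  Sex = "F", Age = "33")
)

# Step 1: extract the Age value of every document.
ages <- sapply(docs, function(d) d$Age)

# Step 2: convert to numeric before ordering, so that ages compare as numbers.
age_order <- order(as.numeric(ages))

# Step 3: build the reordered collection by indexing with the permutation.
sorted_docs <- docs[age_order]

sorted_ages <- sapply(sorted_docs, function(d) d$Age)
print(sorted_ages)
```

The same pattern (extract, order, index) should carry over to the corpus, with meta() doing the extraction.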

Shad Thomas

Kelvin Lam wrote:
> Hi all,
> I wonder if there's any way to reshuffle the text collection by the
> document meta values.  For instance, if I have 5 documents that correspond
> to the following meta data:
> MetaID  Sex  Age
> 0       M    38
> 0       M    46
> 0       F    24
> 0       F    49
> 0       F    33
> Can I reorder the text documents based on the ascending order of age? 
> Thank you very much!! 
