[R] library/function to compare two phrases?
Brian Feeny
bfeeny at mac.com
Sun Nov 18 00:00:57 CET 2012
I am looking for a library/function in R that can compare two phrases and give me a score, or otherwise classify them as correctly as possible.
The "phrases" are obfuscated/messy. I am not concerned about which one is "correct" (as in spell checking); I am only concerned with grouping them
so that I know which ones are the closest match.
Example:
I have ROW1 and ROW2 like so:
ROW1                  ROW2
hamburger helper      bigmc heartkcatta
chicken nuggets       chicke, nuggets, jss
bigmac heartattack    some sombody somehwere
somebody somehwere    repleh regrubmah
I am looking for something that can tell me that the best match for "hamburger helper" is "repleh regrubmah", and likewise for every other entry.
So my goal is to write a program that, for each phrase in ROW1, runs this function against every phrase in ROW2 and gives me the phrase that scored best.
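
For illustration, here is a minimal sketch of that loop. It assumes edit distance is an acceptable score and uses adist() from base R's utils package; that choice is an assumption, not something I know to be the right tool for this data. Because some phrases appear to be reversed, the sketch also scores each ROW1 phrase against the character-reversed ROW2 phrase and keeps the smaller of the two distances. The helper reverse_string() is a name made up for this example.

```r
# Example data from the two columns above (typos are intentional).
row1 <- c("hamburger helper", "chicken nuggets",
          "bigmac heartattack", "somebody somehwere")
row2 <- c("bigmc heartkcatta", "chicke, nuggets, jss",
          "some sombody somehwere", "repleh regrubmah")

# Hypothetical helper: reverse the characters of each string.
reverse_string <- function(x) {
  vapply(strsplit(x, ""),
         function(chars) paste(rev(chars), collapse = ""),
         character(1))
}

# adist() returns a matrix of generalized Levenshtein distances,
# one row per ROW1 phrase and one column per ROW2 phrase.
d_forward <- adist(row1, row2)
d_reverse <- adist(row1, reverse_string(row2))

# Assumption: a phrase may be stored backwards, so keep the smaller distance.
d <- pmin(d_forward, d_reverse)

# For each ROW1 phrase, pick the ROW2 phrase with the lowest distance.
best <- apply(d, 1, which.min)

result <- data.frame(
  ROW1       = row1,
  best_match = row2[best],
  distance   = d[cbind(seq_along(row1), best)],
  stringsAsFactors = FALSE
)
print(result)
```

On these four rows the sketch pairs "hamburger helper" with "repleh regrubmah", "chicken nuggets" with "chicke, nuggets, jss", "bigmac heartattack" with "bigmc heartkcatta", and "somebody somehwere" with "some sombody somehwere". Raw edit distance favours phrases of similar length, so on real data the score may need to be normalized, for example by dividing by the length of the longer phrase.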
I have read over many of the NLP packages at http://cran.r-project.org/web/views/NaturalLanguageProcessing.html
I thought lsa might be a good fit, but I am not sure. I have limited time, so I am hoping someone can point me in the direction of what I am looking for.
I have been searching for "text classifiers", but perhaps this problem is referred to as something else.
Brian