[R] Memory management in R
Mike Marchywka
marchywka at hotmail.com
Sun Oct 10 16:00:35 CEST 2010
----------------------------------------
> Date: Sun, 10 Oct 2010 15:27:11 +0200
> From: lorenzo.isella at gmail.com
> To: dwinsemius at comcast.net
> CC: r-help at r-project.org
> Subject: Re: [R] Memory management in R
>
>
> > I already offered the Biostrings package. It provides more robust
> > methods for string matching than does grepl. Is there a reason that you
> > choose not to?
> >
>
> Indeed, that is the way I should go, and I have installed the package
> after some struggling. Since Biostrings is a fairly complex package and I
> need only a way to check whether a certain string A is a substring of string B,
> do you know which Biostrings functions achieve this?
> I see a lot of methods for biological (DNA, RNA) sequences, and they may
> not apply to my series (which are definitely not from biology).
Generally the differences relate to the alphabet and "things you may want
to know about them." Unless you are looking for reverse-complement
text strings, there will be a lot of stuff you don't need. Offhand,
I'd be looking for things like computational linguistics packages,
as you are looking to find patterns or predictability in human-readable
character sequences. Now, humans can probably write hairpin text (look
at what RNA can do, LOL), but this is probably not what you care about.
However, as I mentioned earlier, I had to write my own regex compiler (coincidentally
for bio apps) to get the required performance. Your application and understanding
may benefit from things like building dictionaries, which aren't really
part of regex and which can easily be done in a few lines of C++ code
using STL containers. To get statistically meaningful samples, you will
almost certainly need faster code.
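
As a rough sketch of the dictionary idea (not the regex compiler mentioned
above), something like the following would do. It assumes plain byte strings,
and the names contains and count_ngrams are made up for illustration. The
first function is the literal "is A inside B" test; the second builds a
dictionary of every length-n substring with its count in a std::unordered_map.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper: true if `needle` occurs anywhere inside `haystack`.
// This is a literal (non-regex) substring test.
bool contains(const std::string& haystack, const std::string& needle) {
    return haystack.find(needle) != std::string::npos;
}

// Hypothetical helper: build a dictionary mapping each length-n substring
// of `text` to the number of times it occurs (overlapping windows).
std::unordered_map<std::string, std::size_t>
count_ngrams(const std::string& text, std::size_t n) {
    std::unordered_map<std::string, std::size_t> counts;
    if (n == 0 || text.size() < n) {
        return counts;  // nothing to count
    }
    for (std::size_t i = 0; i + n <= text.size(); ++i) {
        ++counts[text.substr(i, n)];  // operator[] inserts 0 on first sight
    }
    return counts;
}
```

For example, count_ngrams("abab", 2) yields "ab" twice and "ba" once. If you
stay inside R, grepl(A, B, fixed = TRUE) performs the same literal test as
contains.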
> Cheers
>
> Lorenzo
>