[R] how to load only lines that start with a particular symbol

Gabor Grothendieck ggrothendieck at gmail.com
Wed Sep 16 02:50:54 CEST 2009

In the Windows cmd shell, ^ escapes the next character, so the >
below reaches findstr literally instead of being treated as output
redirection. Try this (assuming the data you posted is in
genetest.dat in the current directory):

> readLines(pipe("findstr/b ^> genetest.dat"))
[1] ">gene A;....." ">gene B;...."

On UNIX, replace the findstr command inside the quotes with the
corresponding grep command, making sure you quote or escape the >
appropriately for the shell you use.
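
As a rough sketch of the UNIX variant, assuming a POSIX-like shell with
grep on the PATH: the temporary file and the variable names below are
made up for illustration and stand in for genetest.dat. Single quotes
around the pattern keep the shell from reading > as a redirection. The
last two lines show a shell-free alternative that reads every line and
filters inside R, which may be acceptable when the file fits in memory.

```r
# Hypothetical sample file standing in for genetest.dat
path <- tempfile(fileext = ".dat")
writeLines(c(">gene A;.....", "ACGTACGT", ">gene B;....", "TTGACCA"), path)

# UNIX only: let grep do the filtering before R sees the data.
# The single quotes stop the shell from interpreting > as redirection.
headers_grep <- character(0)
if (.Platform$OS.type == "unix") {
  con <- pipe(paste("grep '^>'", shQuote(path)))
  headers_grep <- readLines(con)
  close(con)  # readLines() closes but does not destroy the connection
}

# Shell-free alternative: read all lines, keep those starting with ">"
all_lines <- readLines(path)
headers_r <- all_lines[grepl("^>", all_lines)]
```

The pipe version avoids pulling the sequence lines into R at all; the
pure R version works the same way on any platform.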

On Tue, Sep 15, 2009 at 4:59 PM, J Chen <jiaxuan.chen at mdc-berlin.de> wrote:
> Dear all,
> I have DNA sequence data which are fasta-formatted as
>>gene A;.....
>>gene B;....
> I want to load only the lines that start with ">" where the annotation
> information for the gene is contained. In principle, I can remove the
> sequences before loading or after loading all the lines. I just wonder if
> there's a way to load only lines with a particular pattern. The skip
> argument in read.table() doesn't work for my purpose.
> Thanks in advance,
> Jimmy
