[R] regex in R

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Mar 22 07:47:37 CET 2004


On Sun, 21 Mar 2004, Fred J. wrote:

> I could use some help here with trying to use perl
> style regex to extract the first group of letters
> before a ( . )
> so if I have a string AACEE.adiid and want AACEE
> i <- "AACEE.adiid"
> grep(".+\..?+",i,perl=T)
> I must be doing something wrong but don't know what it
> is?

First, see ?regexp, which says

     Patterns are described here as they would be printed by 'cat': do
     remember that backslashes need to be doubled in entering R
     character strings from the keyboard.

so you need to double the backslash: to get the regex \. you must type \\. 
inside the R string.
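
A minimal sketch of what the doubling does, assuming a current version of R 
(where a lone "\." inside a string is rejected by the parser as an 
unrecognized escape).  The variable names pat, escaped and fixed_dot are 
only illustrative:

```r
# The R string "\\." holds two characters: a backslash and a dot.
pat <- "\\."
nchar(pat)       # 2
cat(pat, "\n")   # shows the pattern as the regex engine sees it: \.

i <- "AACEE.adiid"

# "\\." matches a literal dot.
escaped <- grepl(pat, i)

# An alternative that avoids escaping: treat the pattern as a fixed string.
fixed_dot <- grepl(".", i, fixed = TRUE)
```

Both escaped and fixed_dot should be TRUE here, since i contains a dot.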

Second, your pattern is wrong.  You want everything up to the first ., so use

".+?\\..*"

in perl style, or just "[^.]+\\..+" in any style.
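
As a quick check of the two patterns, here is a small sketch using grepl.  
The test vector x and its extra elements are made up for illustration; only 
"AACEE.adiid" comes from the question:

```r
# One string with a dot in the middle, one without a dot,
# and one where nothing precedes the dot.
x <- c("AACEE.adiid", "nodot", ".leading")

# Perl-style: non-greedy run of characters, a literal dot, then anything.
perl_hits <- grepl(".+?\\..*", x, perl = TRUE)

# Any style: one or more non-dot characters, a literal dot, then at least
# one more character.
any_hits <- grepl("[^.]+\\..+", x)
```

Both patterns require at least one character before the dot, so only the 
first element should match in each case.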

Third, grep only tells you which elements of the input match the pattern.  
If you want to extract the matched part, you need to use sub and 
sub-expressions, as in

sub("(.+?)(\\..+)", "\\1", i, perl=TRUE)
sub("([^.]+)(\\..+)", "\\1", i)
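
A short sketch of how these calls behave on a few inputs.  The strings 
"a.b.c" and "nodot" and the variable names are assumptions added for 
illustration, and the last two lines show alternatives that are not part of 
the advice above:

```r
i <- "AACEE.adiid"

# The two calls above: keep group 1, drop the dot and what follows.
first_perl <- sub("(.+?)(\\..+)", "\\1", i, perl = TRUE)
first_any  <- sub("([^.]+)(\\..+)", "\\1", i)

# With several dots, the part before the first dot is kept.
multi <- sub("([^.]+)(\\..+)", "\\1", "a.b.c")

# When the pattern does not match, sub returns the input unchanged.
unchanged <- sub("([^.]+)(\\..+)", "\\1", "nodot")

# Alternatives: delete from the first dot onwards, or split on the dot.
stripped <- sub("\\..*", "", i)
split_1  <- strsplit(i, ".", fixed = TRUE)[[1]][1]
```

Note that sub gives no indication of a failed match, so it may be worth 
testing with grepl first if the input might lack a dot.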


Please do read the help pages before posting: they have the information 
and relevant examples.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



