[R] regexp problem

David Winsemius dwinsemius at comcast.net
Fri Jul 1 17:21:19 CEST 2011


On Jul 1, 2011, at 11:02 AM, Rainer M Krug wrote:

> Hi
>
> I have a question concerning regexp - I want to select with grep all
> character strings which contain the numbers 11:20 (code below).
>
> At the moment I am using [], but that obviously does not work, as it
> matches each element in the []. Is there a way to specify that the
> regexp should match 11, but not 1?
>
> Here is the code:
>
> x <- paste("suff", 1:40, "pref", sep="_")
> x
> ##  [1] "suff_1_pref"  "suff_2_pref"  "suff_3_pref"  "suff_4_pref"
> "suff_5_pref"
> ##  [6] "suff_6_pref"  "suff_7_pref"  "suff_8_pref"  "suff_9_pref"
> "suff_10_pref"
> ## [11] "suff_11_pref" "suff_12_pref" "suff_13_pref" "suff_14_pref"
> "suff_15_pref"
> ## [16] "suff_16_pref" "suff_17_pref" "suff_18_pref" "suff_19_pref"
> "suff_20_pref"
> ## [21] "suff_21_pref" "suff_22_pref" "suff_23_pref" "suff_24_pref"
> "suff_25_pref"
> ## [26] "suff_26_pref" "suff_27_pref" "suff_28_pref" "suff_29_pref"
> "suff_30_pref"
> ## [31] "suff_31_pref" "suff_32_pref" "suff_33_pref" "suff_34_pref"
> "suff_35_pref"
> ## [36] "suff_36_pref" "suff_37_pref" "suff_38_pref" "suff_39_pref"
> "suff_40_pref"
>

 > grep("suff_1[1-9]|suff_20", x, value=TRUE)
  [1] "suff_11_pref" "suff_12_pref" "suff_13_pref" "suff_14_pref" "suff_15_pref" "suff_16_pref"
  [7] "suff_17_pref" "suff_18_pref" "suff_19_pref" "suff_20_pref"
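
A slightly tighter variant, offered only as a sketch: the pattern above does not say what follows the number, so a string such as "suff_110_pref" would also be accepted if it were present. The two extra elements below are hypothetical and were not in the original data; anchoring with "^" and requiring the trailing underscore are my assumptions about what is wanted.

```r
# Rebuild the example vector from the original post.
x <- paste("suff", 1:40, "pref", sep = "_")

# Hypothetical extra elements (not in the original data) to show the difference.
y <- c(x, "suff_110_pref", "suff_200_pref")

# Unanchored after the digits: also accepts longer numbers that
# merely start with 11-19 or with 20.
loose <- grep("suff_1[1-9]|suff_20", y, value = TRUE)

# Requiring the underscore after the number keeps only 11 through 20.
tight <- grep("^suff_(1[1-9]|20)_", y, value = TRUE)

length(loose)  # 12: the ten wanted strings plus the two hypothetical ones
length(tight)  # 10
```

For the vector as posted, both patterns return the same ten elements; the difference only shows up when longer numbers can occur.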

> i <- paste(11:20, collapse=",")
> i
> ## [1] "11,12,13,14,15,16,17,18,19,20"

That does not look right. You now have a single character string with
lots of commas in it.
>
> grep(paste("suff_[", i, "]", sep=""), x, value=TRUE)
> ##  [1] "suff_1_pref"  "suff_2_pref"  "suff_3_pref"  "suff_4_pref"
> "suff_5_pref"
> ##  [6] "suff_6_pref"  "suff_7_pref"  "suff_8_pref"  "suff_9_pref"
> "suff_10_pref"
> ## [11] "suff_11_pref" "suff_12_pref" "suff_13_pref" "suff_14_pref"
> "suff_15_pref"
> ## [16] "suff_16_pref" "suff_17_pref" "suff_18_pref" "suff_19_pref"
> "suff_20_pref"
> ## [21] "suff_21_pref" "suff_22_pref" "suff_23_pref" "suff_24_pref"
> "suff_25_pref"
> ## [26] "suff_26_pref" "suff_27_pref" "suff_28_pref" "suff_29_pref"
> "suff_30_pref"
> ## [31] "suff_31_pref" "suff_32_pref" "suff_33_pref" "suff_34_pref"
> "suff_35_pref"
> ## [36] "suff_36_pref" "suff_37_pref" "suff_38_pref" "suff_39_pref"
> "suff_40_pref"
>
The list of values in a [ ] expression is not delimited by commas; every
character between the brackets, commas included, is an allowed match for
one single character. So you are matching whenever the first character
following the underscore is any character in the "i" string.

 > x[40] <- 'suff_,zz_pref'
 > grep(paste("suff_[", i, "]", sep=""), x, value=TRUE)
# x[40] matches too, because the comma is one of the characters in the [ ] list
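
If the aim is to build the pattern from the numbers themselves, one possible approach (a sketch, not the only way) is to collapse with "|" rather than ",", so that each number becomes one alternative inside a parenthesised group. The names alt, pattern, by_regex, num and by_number are just illustrative, and the second half assumes every element has the form suff_<number>_pref.

```r
# Rebuild the example vector from the original post.
x <- paste("suff", 1:40, "pref", sep = "_")

# Collapse with "|" (alternation) instead of ",":
# each number becomes one complete alternative.
alt <- paste(11:20, collapse = "|")
pattern <- paste("^suff_(", alt, ")_", sep = "")
pattern

by_regex <- grep(pattern, x, value = TRUE)

# Alternative that avoids alternation: extract the number with sub()
# and compare it numerically.
num <- as.integer(sub("^suff_([0-9]+)_pref$", "\\1", x))
by_number <- x[num %in% 11:20]

identical(by_regex, by_number)
```

The numeric comparison may be easier to maintain when the wanted range changes, since no regular expression has to be rewritten.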

> ## But I would like to have
> ## [1] "suff_11_pref" "suff_12_pref" "suff_13_pref" "suff_14_pref"
> "suff_15_pref"
> ## [6] "suff_16_pref" "suff_17_pref" "suff_18_pref" "suff_19_pref"
> "suff_20_pref"
>
> Version and platform info:
>
>> version
>               _
> platform       i686-pc-linux-gnu
> arch           i686
> os             linux-gnu
> system         i686, linux-gnu
> status
> major          2
> minor          13.0
> year           2011
> month          04
> day            13
> svn rev        55427
> language       R
> version.string R version 2.13.0 (2011-04-13)
>
>> sessionInfo()
> R version 2.13.0 (2011-04-13)
> Platform: i686-pc-linux-gnu (32-bit)
>
> locale:
> [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
> [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
> [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
> [7] LC_PAPER=en_GB.utf8       LC_NAME=C
> [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] reshape_0.8.4   plyr_1.5.2      tgp_2.4-2       lhs_0.5
> [5] RSQLite_0.9-4   DBI_0.2-5       date_1.2-29     simecol_0.7-2
> [9] lattice_0.19-26 deSolve_1.10-2
>
> loaded via a namespace (and not attached):
> [1] grid_2.13.0  tools_2.13.0
>>
>
> Thanks in advance,
>
> Rainer
>
> -- 
> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation  
> Biology,
> UCT), Dipl. Phys. (Germany)
>
> Centre of Excellence for Invasion Biology
> Stellenbosch University
> South Africa
>
> Tel :       +33 - (0)9 53 10 27 44
> Cell:       +33 - (0)6 85 62 59 98
> Fax (F):       +33 - (0)9 58 10 27 44
>
> Fax (D):    +49 - (0)3 21 21 25 22 44
>
> email:      Rainer at krugs.de
>
> Skype:      RMkrug
>

David Winsemius, MD
West Hartford, CT


