[R] overlapping pattern match (errata 2.0)

james.holtman at convergys.com
Sat Mar 29 18:02:54 CET 2003


Another way to find every run of repeated characters in a string
is to use 'rle':

> x.s <- 'aaabbcdeeeffffggiijjysbbddeffghjjjsdkkkkk'
> x <- unlist(strsplit(x.s, NULL))
> x
 [1] "a" "a" "a" "b" "b" "c" "d" "e" "e" "e" "f" "f" "f" "f" "g" "g" "i" "i" "j"
[20] "j" "y" "s" "b" "b" "d" "d" "e" "f" "f" "g" "h" "j" "j" "j" "s" "d" "k" "k"
[39] "k" "k" "k"
> rle(x)
Run Length Encoding
  lengths: int [1:21] 3 2 1 1 3 4 2 2 2 1 ...
  values : chr [1:21] "a" "b" "c" "d" "e" "f" "g" "i" "j" "y" "s" "b" "d" "e" "f" "g" ...
>

Wherever an entry of 'lengths' is greater than 1, the corresponding entry
of 'values' is a repeated character.
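
As a minimal sketch of how the 'rle' result might be used, the repeated
characters and where each run starts can be pulled out as below. The names
'runs', 'is.rep', 'repeated' and 'n.aa' are illustrative only and are not part
of the original post. The last line relies on the fact that a run of n
identical characters contains n - 1 overlapping two-character matches.

```r
x.s <- 'aaabbcdeeeffffggiijjysbbddeffghjjjsdkkkkk'
runs <- rle(unlist(strsplit(x.s, NULL)))

# TRUE for every run that is longer than one character
is.rep <- runs$lengths > 1

# start position of each run in the original string
run.start <- cumsum(runs$lengths) - runs$lengths + 1

# one row per repeated run: the character, its run length, its start index
repeated <- data.frame(char   = runs$values[is.rep],
                       length = runs$lengths[is.rep],
                       start  = run.start[is.rep],
                       stringsAsFactors = FALSE)
print(repeated)

# overlapping occurrences of "aa": each run of "a" of length n gives n - 1
n.aa <- sum(runs$lengths[runs$values == "a"] - 1)
```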




                                                                                                                                           
From:     FMGCFMGC <FMGCFMGC at terra.es>
Sent by:  r-help-bounces at stat.math.ethz.ch
Date:     03/28/03 17:36
To:       yfan at diversa.com
cc:       r-help at stat.math.ethz.ch
Subject:  Re: [R] overlapping pattern match (errata 2.0)




well! excuse me again but...

your.string <- "aaacdf"
nc1 <- nchar(your.string) - 1
# split the string into single characters
x <- unlist(strsplit(your.string, NULL)) ######## CORRECT
x2 <- character(0)
# collect every overlapping two-character window
for (i in seq_len(nc1))
  x2 <- c(x2, paste(x[i], x[i+1], sep="")) ######## ERRATA 2
cat("occurrences of <aa> in <your.string>: ", length(grep("aa", x2)),
    sep="", fill=TRUE)
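
The loop above can probably be replaced by a vectorised call to 'substring',
which extracts all overlapping windows at once. This is only a sketch under
the assumption that the pattern is a fixed string rather than a regular
expression; the helper name 'count.overlapping' is hypothetical and does not
come from the thread.

```r
# count overlapping occurrences of the fixed string 'pat' in 's'
count.overlapping <- function(s, pat) {
  k <- nchar(pat)
  n <- nchar(s)
  # nothing can match if the pattern is empty or longer than the string
  if (k == 0 || n < k) return(0L)
  starts <- seq_len(n - k + 1)
  # every window of width k, starting at each position in turn
  windows <- substring(s, starts, starts + k - 1)
  sum(windows == pat)
}

count.overlapping("aaacdf", "aa")
```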

Fran

PS: sorry again

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help






