[R] vectorization condition counting

William Dunlap wdunlap at tibco.com
Sat Aug 11 02:02:28 CEST 2012


Your sum(tag_id==tag_id[i])==1, meaning tag_id[i] is the only entry with its
value, may be vectorized by the sneaky idiom
   !(duplicated(tag_id,fromLast=FALSE) | duplicated(tag_id,fromLast=TRUE))
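
To see why the idiom works, here is a minimal sketch on a small vector (the variable names dupForward, dupBackward, isUnique and loopVersion are made up for illustration). duplicated() scanning forward flags every repeat except the first, scanning backward flags every repeat except the last, so a value flagged by neither scan occurs exactly once:

```r
tag_id <- c(1, 2, 2, 3, 4, 5, 6, 6)

# Flags the 2nd, 3rd, ... occurrence of each value (scanning left to right).
dupForward  <- duplicated(tag_id, fromLast = FALSE)
# Flags every occurrence except the last one (scanning right to left).
dupBackward <- duplicated(tag_id, fromLast = TRUE)

# An entry flagged by neither scan has no twin anywhere in the vector.
isUnique <- !(dupForward | dupBackward)

# The element-by-element count from the original question, for comparison.
loopVersion <- vapply(seq_along(tag_id),
                      function(i) sum(tag_id == tag_id[i]) == 1,
                      logical(1))

print(isUnique)
print(identical(isUnique, loopVersion))
```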

Hence f0() (with your code in a loop) and f1() are equivalent:
f0 <- function (tags) {
    for (i in seq_len(nrow(tags))) {
        if (sum(tags$tag_id == tags$tag_id[i]) == 1 & tags$lgth[i] < 300) {
            tags$stage[i] <- "J"
        }
    }
    tags
}
f1 <- function (tags) {
    needsChanging <- with(tags, !(duplicated(tag_id, fromLast = FALSE) |
        duplicated(tag_id, fromLast = TRUE)) & lgth < 300)
    tags$stage[needsChanging] <- "J"
    tags
}

E.g.,
> someTags <- data.frame(tag_id = c(1, 2, 2, 3, 4, 5, 6, 6), lgth = 50*(1:8), stage=factor(rep(".",8), levels=c(".","J")))
> all.equal(f0(someTags), f1(someTags))
[1] TRUE
> f1(someTags)
  tag_id lgth stage
1      1   50     J
2      2  100     .
3      2  150     .
4      3  200     J
5      4  250     J
6      5  300     .
7      6  350     .
8      6  400     .
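
Since the question was about speed, here is a rough sketch of how one might compare the two versions on a larger table. The data are synthetic and the size n, the seed, and the name bigTags are arbitrary assumptions, not anything from the original dataset; f0() and f1() are restated only so the snippet runs on its own. Timings depend on the machine, so none are quoted here.

```r
# Loop version, restated from the message.
f0 <- function (tags) {
    for (i in seq_len(nrow(tags))) {
        if (sum(tags$tag_id == tags$tag_id[i]) == 1 & tags$lgth[i] < 300) {
            tags$stage[i] <- "J"
        }
    }
    tags
}
# Vectorized version, restated from the message.
f1 <- function (tags) {
    needsChanging <- with(tags, !(duplicated(tag_id, fromLast = FALSE) |
        duplicated(tag_id, fromLast = TRUE)) & lgth < 300)
    tags$stage[needsChanging] <- "J"
    tags
}

set.seed(1)   # arbitrary seed, only so the run is repeatable
n <- 5000     # hypothetical size; raise it to see the gap widen
bigTags <- data.frame(tag_id = sample(n, n, replace = TRUE),
                      lgth = runif(n, 0, 600),
                      stage = factor(rep(".", n), levels = c(".", "J")))

timeLoop <- system.time(resLoop <- f0(bigTags))
timeVec  <- system.time(resVec  <- f1(bigTags))

# Elapsed seconds for each version, then a check that the results agree.
print(c(loop = timeLoop[["elapsed"]], vectorized = timeVec[["elapsed"]]))
sameResult <- isTRUE(all.equal(resLoop, resVec))
print(sameResult)
```

The loop does a full pass over tag_id for every row, so its cost grows roughly with the square of the number of rows, while the vectorized version makes only a few passes over the column.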

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Guillaume2883
> Sent: Friday, August 10, 2012 3:47 PM
> To: r-help at r-project.org
> Subject: [R] vectorization condition counting
> 
> Hi all,
> 
> I am working on a really big dataset and I would like to vectorize a
> condition in an if statement inside a loop to improve speed.
> 
> The original loop with the condition is currently written as follows:
> 
> if(sum(as.integer(tags$tag_id==tags$tag_id[i]))==1&tags$lgth[i]<300){
> 
>      tags$stage[i]<-"J"
> 
>    }
> 
> Do you have any ideas? I was unable to do it correctly.
> Thank you in advance for your help.
> 
> Guillaume
> 
> 
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/vectorization-condition-counting-tp4639992.html
> Sent from the R help mailing list archive at Nabble.com.