[R] Position in a vector of the last value > n - *SOLVED*

Thaden, John J ThadenJohnJ at uams.edu
Sat Jul 12 23:00:05 CEST 2008


Yes, your version (func2) is quick, and it is the quickest of the three for longer vectors:
> m <- matrix(rexp(6e6,rate=0.05), nrow=50000) # 120 cols
> m[m<20] <- 20
> func1 <- function(v,cut=20)  max(which(v>cut))
> func2 <- function(v,cut=20) {
+    x <- which(v>cut)
+    x[length(x)]
+ }
> func3 <- function(v,cut=20) tail(which(v>cut), 1)
> system.time(apply(m, 2, func1))
   user  system elapsed 
   0.58    0.01    0.59 
> system.time(apply(m, 2, func2))
   user  system elapsed 
   0.48    0.04    0.53 
> system.time(apply(m, 2, func3))
   user  system elapsed 
   0.55    0.00    0.56
-John Thaden
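
One caveat on the three variants timed above: they behave differently when no element exceeds the cutoff. max(which(...)) returns -Inf with a warning, while the other two return a zero-length vector, which can make apply() return a list instead of a vector. A minimal sketch of a guarded version follows; the name last_above and the choice of NA as the "nothing found" value are assumptions for illustration, not something proposed in this thread.

```r
# Hypothetical helper (name and NA convention are assumptions):
# index of the last element of v greater than cut, or NA if there is none.
last_above <- function(v, cut = 20) {
  x <- which(v > cut)
  if (length(x) > 0) x[length(x)] else NA_integer_
}

v      <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)  # example vector from the thread
v_none <- c(20, 20, 20)                            # nothing above the cutoff

last_above(v)        # index of the value 500
last_above(v_none)   # NA rather than -Inf or integer(0)
```

Because the result always has length one, apply(m, 2, last_above) should keep returning a plain vector even when some columns have no value above the cutoff.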

-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com] 
Sent: Saturday, July 12, 2008 6:56 AM
To: Thaden, John J
Cc: r-help at r-project.org
Subject: Re: [R] Position in a vector of the last value > n - *SOLVED*

A slight modification that avoids 'tail' gives equivalent results:

> m <- matrix(rexp(6e6,rate=0.05), nrow=600) # 10,000 cols
> m[m<20] <- 20
> func1 <- function(v,cut=20)  max(which(v>cut))
> func2 <- function(v,cut=20) {
+     x <- which(v>cut)
+     x[length(x)]
+ }
> system.time(apply(m, 2, func1))
   user  system elapsed
   1.33    0.05    1.47
> #   user  system elapsed
> #   0.40    0.02    0.42
> system.time(apply(m, 2, func2))
   user  system elapsed
   1.31    0.08    1.44
> #   user  system elapsed
> #   0.70    0.05    0.75
>
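
Both variants above still call a function once per column. As a hedged sketch of an alternative that avoids apply() altogether, base R's max.col() with ties.method = "last" can be applied to the transposed logical matrix. The smaller matrix and the seed below are assumptions chosen to keep the example quick, and no timing is claimed; whether this is faster would need to be measured on real data.

```r
set.seed(42)                                     # arbitrary seed, reproducibility only
m <- matrix(rexp(6e4, rate = 0.05), nrow = 600)  # 100 columns, smaller than the mockup
m[m < 20] <- 20
cut <- 20

above <- m > cut                                 # logical matrix, same shape as m
# max.col() works row-wise, so transpose; "last" picks the last TRUE in each row
last_idx <- max.col(t(above), ties.method = "last")
# a column with no TRUE would otherwise report nrow(m); mark it NA instead
last_idx[colSums(above) == 0] <- NA_integer_

# cross-check against the per-column version from this thread
func2 <- function(v, cut = 20) {
  x <- which(v > cut)
  x[length(x)]
}
ref <- apply(m, 2, func2)
all(last_idx == ref)
```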

Here is another view, using Rprof on the first version (the one that
calls 'tail').  You can see that 'tail' takes a fair amount of time,
which accounts for the differences in timing:

/cygdrive/c: perl perf/bin/readrprof.pl tempxx.txt
  0   2.7 root
  1.    1.8 system.time
  2. .    1.7 eval
  3. . .    1.7 eval
  4. . . .    1.7 apply
  5. . . . |    1.5 FUN
  6. . . . | .    0.8 tail
  7. . . . | . .    0.5 which
  8. . . . | . . .    0.1 &
  8. . . . | . . .    0.0 >
  8. . . . | . . .    0.0 !
  7. . . . | . .    0.3 tail.default
  8. . . . | . . .    0.2 stopifnot
  9. . . . | . . . .    0.1 eval
  9. . . . | . . . .    0.0 match.call
  9. . . . | . . . .    0.0 any
  6. . . . | .    0.5 which
  7. . . . | . .    0.1 &
  7. . . . | . .    0.1 >
  7. . . . | . .    0.0 names<-
  7. . . . | . .    0.0 is.na
  5. . . . |    0.1 aperm
  5. . . . |    0.0 unlist
  6. . . . | .    0.0 lapply
  5. . . . |    0.0 is.null
  2. .    0.1 gc
  1.    0.8 matrix
  2. .    0.7 as.vector
  3. . .    0.6 rexp
  1.    0.1 <
/cygdrive/c:
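
The tree above was produced with a Perl script. A comparable flat view can be obtained inside R with Rprof() and summaryRprof(); the sketch below swaps in that built-in summary. The temporary file name, the sampling interval, and the five repetitions are assumptions chosen so that the run is long enough to collect samples.

```r
set.seed(1)                                      # arbitrary seed
m <- matrix(rexp(3e6, rate = 0.05), nrow = 600)  # 5,000 columns, as in the original mockup
m[m < 20] <- 20
func_tail <- function(v, cut = 20) tail(which(v > cut), 1)

prof_file <- tempfile(fileext = ".out")          # hypothetical output location
Rprof(prof_file, interval = 0.01)                # start sampling the call stack
for (i in 1:5) res <- apply(m, 2, func_tail)     # repeat to collect enough samples
Rprof(NULL)                                      # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.total)                              # time per function, callees included
```

The by.total table should show how much of the time inside FUN is spent in tail and tail.default relative to which.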


On Fri, Jul 11, 2008 at 12:23 PM, Thaden, John J <ThadenJohnJ at uams.edu>
wrote:
> I had written asking for a simple way to extract the
> index of the last value in a vector greater than some
> cutoff, e.g., the index, 6, for a cutoff of 20 and this
> example vector:
>
> v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
>
> Thank you, Alain Guillet, for this simple solution sent
> to me offlist:
>
> max(which(v > 20))
>
> Also, thank you Lisa Readdy for a lengthier solution.
>
> Other offerings yielded the value instead of the index
> (the phrasing of my question apparently was misleading):
>
> v[max(which(v > 20))]  (Henrique Dallazuanna)
>
> tail(v[v>20],1)        (Jim Holtman)
>
> Jim's use of tail() suggests a variant of Alain's
> solution:
>
> tail(which(v > 20), 1)
>
> This is faster than the max() version with long vectors,
> but, to my surprise, slower (on my WinXP Lenovo T61 laptop)
> in a rough mockup of my column-wise apply() usage:
>
> m <- matrix(rexp(3e6,rate=0.05), nrow=600) # 5,000 cols
> m[m<20] <- 20
> func1 <- function(v,cut=20)  max(which(v>cut))
> func2 <- function(v,cut=20) tail(which(v>cut),1)
> system.time(apply(m, 2, func1))
> #   user  system elapsed
> #   0.40    0.02    0.42
> system.time(apply(m, 2, func2))
> #   user  system elapsed
> #   0.70    0.05    0.75
>
> Thank you again, Alain and others.
> John
>
> ----------------
>
> On Thu, Jul 10, 2008 at 9:41 AM, John Thaden wrote:
>> This shouldn't be hard, but it's just not
>> coming to me:
>> Given a vector, e.g.,
>> v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
>> how can I get the index of the last value in
>> the vector that has a value greater than n, in
>> the example, with n = 20?  I'm looking for
>> an efficient function I can use on very large
>> matrices, as the FUN argument in the apply()
>> command.
>
> Confidentiality Notice: This e-mail message, including a...{{dropped:8}}
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
