[R] Position in a vector of the last value > n - *SOLVED*
Thaden, John J
ThadenJohnJ at uams.edu
Sat Jul 12 23:00:05 CEST 2008
Yes, your version (func2) is quick, quickest for longer vectors:
> m <- matrix(rexp(6e6,rate=0.05), nrow=50000) # 120 cols
> m[m<20] <- 20
> func1 <- function(v,cut=20) max(which(v>cut))
> func2 <- function(v,cut=20) {
+ x <- which(v>cut)
+ x[length(x)]
+ }
> func3 <- function(v,cut=20) tail(which(v>cut), 1)
> system.time(apply(m, 2, func1))
user system elapsed
0.58 0.01 0.59
> system.time(apply(m, 2, func2))
user system elapsed
0.48 0.04 0.53
> system.time(apply(m, 2, func3))
user system elapsed
0.55 0.00 0.56
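
One caveat that the timings above do not show is that the three variants behave differently when no element exceeds the cutoff. The following is only a minimal sketch of that edge case. The names last_max, last_index, last_tail and last_safe are made up for illustration and are not from this thread, and returning NA is just one possible convention.

```r
# Hypothetical names for the three variants timed above
last_max   <- function(v, cut = 20) max(which(v > cut))
last_index <- function(v, cut = 20) {
  x <- which(v > cut)
  x[length(x)]
}
last_tail  <- function(v, cut = 20) tail(which(v > cut), 1)

v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
last_max(v)    # 6
last_index(v)  # 6
last_tail(v)   # 6

# No element above the cutoff
none <- rep(20, 9)
suppressWarnings(last_max(none))  # -Inf (max() also warns here)
last_index(none)                  # integer(0)
last_tail(none)                   # integer(0)

# Assumed convention: return NA so that apply() always gets one
# value per column and can simplify its result to a plain vector
last_safe <- function(v, cut = 20) {
  x <- which(v > cut)
  if (length(x)) x[length(x)] else NA_integer_
}
m2 <- cbind(v, none)
unname(apply(m2, 2, last_safe))
```

With the zero-length variants, apply() can no longer simplify its result to a vector once any column has no hit, so a guard like last_safe may be worth its small cost.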
-John Thaden
-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com]
Sent: Saturday, July 12, 2008 6:56 AM
To: Thaden, John J
Cc: r-help at r-project.org
Subject: Re: [R] Position in a vector of the last value > n - *SOLVED*
A slight modification gives equivalent results without using 'tail':
> m <- matrix(rexp(6e6,rate=0.05), nrow=600) # 10,000 cols
> m[m<20] <- 20
> func1 <- function(v,cut=20) max(which(v>cut))
> func2 <- function(v,cut=20) {
+ x <- which(v>cut)
+ x[length(x)]
+ }
> system.time(apply(m, 2, func1))
user system elapsed
1.33 0.05 1.47
> # user system elapsed
> # 0.40 0.02 0.42
> system.time(apply(m, 2, func2))
user system elapsed
1.31 0.08 1.44
> # user system elapsed
> # 0.70 0.05 0.75
>
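
A quick way to check that the two versions really agree is sketched below. The seed and the smaller matrix are my own choices to keep it fast, and the functions use the cut argument rather than a hard-coded 20.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
m <- matrix(rexp(3e5, rate = 0.05), nrow = 600)  # 500 columns
m[m < 20] <- 20

func1 <- function(v, cut = 20) max(which(v > cut))
func2 <- function(v, cut = 20) {
  x <- which(v > cut)
  x[length(x)]
}

r1 <- apply(m, 2, func1)
r2 <- apply(m, 2, func2)

# Expected to be TRUE as long as every column has at least one
# value above the cutoff; otherwise the two versions differ
# (-Inf with a warning versus a zero-length result)
identical(r1, r2)
```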
Here is another view, using Rprof on the first version (the one that
calls 'tail'). You can see that 'tail' takes a fair amount of the time,
which accounts for the difference in timing:
/cygdrive/c: perl perf/bin/readrprof.pl tempxx.txt
0 2.7 root
1. 1.8 system.time
2. . 1.7 eval
3. . . 1.7 eval
4. . . . 1.7 apply
5. . . . | 1.5 FUN
6. . . . | . 0.8 tail
7. . . . | . . 0.5 which
8. . . . | . . . 0.1 &
8. . . . | . . . 0.0 >
8. . . . | . . . 0.0 !
7. . . . | . . 0.3 tail.default
8. . . . | . . . 0.2 stopifnot
9. . . . | . . . . 0.1 eval
9. . . . | . . . . 0.0 match.call
9. . . . | . . . . 0.0 any
6. . . . | . 0.5 which
7. . . . | . . 0.1 &
7. . . . | . . 0.1 >
7. . . . | . . 0.0 names<-
7. . . . | . . 0.0 is.na
5. . . . | 0.1 aperm
5. . . . | 0.0 unlist
6. . . . | . 0.0 lapply
5. . . . | 0.0 is.null
2. . 0.1 gc
1. 0.8 matrix
2. . 0.7 as.vector
3. . . 0.6 rexp
1. 0.1 <
/cygdrive/c:
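
The tree above comes from a Perl script run over the raw Rprof output. For anyone without that script, a rough sketch of collecting a comparable profile with base R alone follows. The temporary file, the sampling interval and the name func_tail are my own placeholders, and summaryRprof() gives flat tables rather than a call tree.

```r
prof_file <- tempfile(fileext = ".out")  # placeholder output file

m <- matrix(rexp(3e6, rate = 0.05), nrow = 600)  # 5,000 columns
m[m < 20] <- 20
func_tail <- function(v, cut = 20) tail(which(v > cut), 1)

Rprof(prof_file, interval = 0.01)  # start sampling the call stack
res <- apply(m, 2, func_tail)
Rprof(NULL)                        # stop profiling

# summaryRprof() stops with an error if no samples were recorded
# (e.g. the run was too short), so guard the call
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.total, 10))
```

The by.total table lists time spent in each function including its callees, which is the closest match to the numbers in the tree.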
On Fri, Jul 11, 2008 at 12:23 PM, Thaden, John J <ThadenJohnJ at uams.edu>
wrote:
> I had written asking for a simple way to extract the
> index of the last value in a vector greater than some
> cutoff, e.g., the index, 6, for a cutoff of 20 and this
> example vector:
>
> v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
>
> Thank you, Alain Guillet, for this simple solution sent
> to me offlist:
>
> max(which(v > 20))
>
> Also, thank you Lisa Readdy for a lengthier solution.
>
> Other offerings yielded the value instead of the index
> (the phrasing of my question apparently was misleading):
>
> v[max(which(v > 20))] (Henrique Dallazuanna)
>
> tail(v[v>20],1) (Jim Holtman)
>
> Jim's use of tail() suggests a variant of Alain's
> solution:
>
> tail(which(v > 20), 1)
>
> This is faster than the max() version with long vectors,
> but, to my surprise, slower (on my WinXP Lenovo T61 laptop)
> in a rough mockup of my column-wise apply() usage:
>
> m <- matrix(rexp(3e6,rate=0.05), nrow=600) # 5,000 cols
> m[m<20] <- 20
> func1 <- function(v,cut=20) max(which(v>cut))
> func2 <- function(v,cut=20) tail(which(v>cut),1)
> system.time(apply(m, 2, func1))
> # user system elapsed
> # 0.40 0.02 0.42
> system.time(apply(m, 2, func2))
> # user system elapsed
> # 0.70 0.05 0.75
>
> Thank you again, Alain and others.
> John
>
> ----------------
>
> On Thu, Jul 10, 2008 at 9:41 AM, John Thaden wrote:
>> This shouldn't be hard, but it's just not
>> coming to me:
>> Given a vector, e.g.,
>> v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
>> how can I get the index of the last value in
>> the vector that is greater than n (in the
>> example, n = 20)? I'm looking for
>> an efficient function I can use on very large
>> matrices, as the FUN argument in the apply()
>> command.
>
> Confidentiality Notice: This e-mail message, including a...{{dropped:8}}
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?