[R] Rank-based p-value on large dataset
Huntsinger, Reid
reid_huntsinger at merck.com
Thu Mar 3 23:38:50 CET 2005
When you say the 130,000 points are from the empirical distribution, how did
you get them? Is each one really one of the values of y? If you sorted y
first, would you know which one (i.e., which index) each x is? (Sorting 80,000
elements took essentially no time at all on my sub-gigahertz Pentium III.)
But maybe that's not an option... more details would help.
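For what it's worth, here is a rough sketch of how the sorting idea might be used, assuming x and y are plain numeric vectors (the rnorm data below is only a hypothetical stand-in for the real values). It sorts y once and then uses findInterval() to count, for each x, how many y values are less than or equal to it, which should give the same one-sided p-value as sum(y > x[i])/length(y) without the explicit loop:

```r
# Hypothetical stand-in data; the sizes match those mentioned in the thread.
set.seed(1)
y <- rnorm(80000)
x <- rnorm(130000)

# Sort y once up front.
y.sorted <- sort(y)

# For each x[i], findInterval() returns the number of sorted y values
# that are <= x[i] (0 if x[i] is below all of them).
n.le <- findInterval(x, y.sorted)

# One-sided p-value: fraction of y strictly greater than x[i].
p.val <- (length(y) - n.le) / length(y)
```

If each x really is one of the y values, the rank of that value in the sorted y would give the same count directly; the findInterval() version just does not rely on that.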
Reid Huntsinger
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Sean Davis
Sent: Thursday, March 03, 2005 5:22 PM
To: r-help
Subject: [R] Rank-based p-value on large dataset
I have a fairly simple problem--I have about 80,000 values (call them
y) that I am using as an empirical distribution and I want to find the
p-value (never mind the multiple testing issues here, for the time
being) of 130,000 points (call them x) from the empirical distribution.
I typically do that (for a one-sided test) with something like
    p.val <- numeric(length(x))
    for (i in seq_along(x))
        p.val[i] <- sum(y > x[i]) / length(y)
However, length(x) is large here, as is length(y), so this process
takes quite a long time. Any suggestions?
Thanks,
Sean