[R] Rank-based p-value on large dataset
Sean Davis
sdavis2 at mail.nih.gov
Thu Mar 3 23:22:29 CET 2005
I have a fairly simple problem: I have about 80,000 values (call them
y) that I am using as an empirical distribution, and I want to find the
p-value (never mind the multiple-testing issues for the time being) of
each of 130,000 points (call them x) relative to that empirical
distribution.
For a one-sided test, I typically do something like

  p.val <- numeric(length(x))
  for (i in seq_along(x)) {
    p.val[i] <- sum(y > x[i]) / length(y)
  }

However, both length(x) and length(y) are large here, so this loop
makes roughly 130,000 x 80,000 (about 10^10) comparisons and takes
quite a long time. Any suggestions?
Thanks,
Sean
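
One possible way to avoid the loop, sketched here rather than offered as a definitive answer: sort y once and count, for every x at once, how many y values are less than or equal to it. The sketch uses findInterval() from base R. The helper name emp_pval and the simulated data are hypothetical, and it assumes x and y are numeric vectors with no NA values.

```r
# Hypothetical helper: one-sided empirical p-values, P(y > x), without a loop.
# Assumes x and y are numeric vectors containing no NAs.
emp_pval <- function(x, y) {
  y_sorted <- sort(y)                  # sort the reference values once
  n_le <- findInterval(x, y_sorted)    # for each x, the number of y values <= x
  (length(y) - n_le) / length(y)       # proportion of y strictly greater than x
}

# Simulated stand-ins for the real data (sizes taken from the post).
set.seed(1)
y <- rnorm(80000)
x <- rnorm(130000)
p.val <- emp_pval(x, y)

# Sanity check against the original definition on a small subset.
idx <- 1:100
p.loop <- sapply(x[idx], function(xi) sum(y > xi) / length(y))
stopifnot(isTRUE(all.equal(p.val[idx], p.loop)))
```

Because the sort is done once and each lookup is a binary search, the work grows roughly with (length(x) + length(y)) * log(length(y)) instead of length(x) * length(y). The expression 1 - ecdf(y)(x) should give the same values up to floating-point rounding, since ecdf(y)(x) is the proportion of y less than or equal to x.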