[R] Sampling distribution (PDF & CDF) of correlation
Mike Lawrence
Mike.Lawrence at DAL.CA
Thu Jul 17 17:29:20 CEST 2008
Hi all,
I'm looking for an analytic method to obtain the PDF & CDF of the
sampling distribution of a given correlation (rho) at a given sample
size (N).
I've attached code describing a monte carlo method of achieving this,
and while it is relatively fast, an analytic solution would obviously
be optimal.
get.cors <- function(i, x, y, N){
end=i*N
.Internal(cor(x[(end-N+1):end] ,y[(end-N+1):end] ,TRUE ,FALSE ))
}
get.r.dist <- function(N, rho, it){
Sigma=matrix(c(1,rho,rho,1),2,2)
eS = eigen(Sigma, symmetric = TRUE, EISPACK = TRUE)
ev = eS$values
fact = eS$vectors %*% diag(sqrt(pmax(ev, 0)), 2)
Z = rnorm(2 * N * it)
dim(Z) = c(2, N * it)
Z = t(fact %*% Z)
x = Z[, 1]
y = Z[, 2]
r = sapply(1:it ,get.cors,x, y, N)
return(r)
}
#Run 1e3 monte carlo iterations, where each obtains the correlation
# of 10 pairs of observations from a bivariate normal distribution with
# a true correlation of .5. Returns 1e3 values for the observed
correlation
mc.rs = get.r.dist( N=10 , rho=.5 , it=1e3 )
#plot the PDF & CDF
par(mfrow=c(1,2))
hist(mc.rs,prob=T,xlab='Observed correlation')
probs = seq(0,1,.01)
plot(quantile(mc.rs,probs=probs),probs,type='l',xlab='Observed
correlation',ylab='Cumulative probability')
--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University
www.memetic.ca
"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
- Piet Hein
More information about the R-help
mailing list