[R] correlation with missing values.. different answers
arun
smartpink111 at yahoo.com
Mon Apr 14 03:36:04 CEST 2014
Hi,
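I think in this case, "na.or.complete" removes every row that has an NA in any of the columns you pass to cor(). With the full dataset that drops rows 1, 7 and 25; with swM[,1:2] only row 1 has an NA, so only that row is dropped and the two calls are computed on different rows. Removing just row 1 by hand reproduces your second result: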
cor(swM[-1,1:2])
#          Frtlty    Agrclt
#Frtlty 1.0000000 0.3920289
#Agrclt 0.3920289 1.0000000
cor(swM[-1,])[1:2,1:2]
#          Frtlty    Agrclt
#Frtlty 1.0000000 0.3920289
#Agrclt 0.3920289 1.0000000
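A rough sketch of that check, assuming swM is built exactly as in the original post (it is rebuilt here so the snippet runs on its own; the names bad_all, bad_two, full, sub, manual_full and manual_sub are just illustrative):

```r
# Rebuild swM as in the example from the help page of cor()
swM <- swiss
colnames(swM) <- abbreviate(colnames(swiss), min = 6)
swM[1, 2] <- swM[7, 3] <- swM[25, 5] <- NA

# Rows containing at least one NA: all six columns vs. only the first two
bad_all <- which(!complete.cases(swM))         # rows dropped for the full data
bad_two <- which(!complete.cases(swM[, 1:2]))  # rows dropped for columns 1:2

# Casewise deletion done by cor() itself
full <- cor(swM, use = "na.or.complete")[1:2, 1:2]
sub  <- cor(swM[, 1:2], use = "na.or.complete")

# The same thing done by hand, dropping the rows found above
manual_full <- cor(swM[-bad_all, 1:2])
manual_sub  <- cor(swM[-bad_two, 1:2])

all.equal(full, manual_full)
all.equal(sub, manual_sub)
```

So the difference between the two results comes only from rows 7 and 25, which are complete in columns 1:2 but are still dropped when the whole data frame is passed.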
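Maybe you can try "pairwise.complete.obs", which uses, for each pair of columns, all rows where both values are present, so the result for a pair does not depend on the other columns: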
cor(swM, use = "pairwise.complete.obs")
#           Frtlty      Agrclt     Exmntn      Eductn     Cathlc      Infn.M
#Frtlty  1.0000000  0.39202893 -0.6531492 -0.66378886  0.4723129  0.41655603
#Agrclt  0.3920289  1.00000000 -0.7150561 -0.65221506  0.4152007 -0.03648427
#Exmntn -0.6531492 -0.71505612  1.0000000  0.69921153 -0.6003402 -0.11433546
#Eductn -0.6637889 -0.65221506  0.6992115  1.00000000 -0.1791334 -0.09932185
#Cathlc  0.4723129  0.41520069 -0.6003402 -0.17913339  1.0000000  0.18503786
#Infn.M  0.4165560 -0.03648427 -0.1143355 -0.09932185  0.1850379  1.00000000
cor(swM[,1:2],use="pairwise.complete.obs")
#          Frtlty    Agrclt
#Frtlty 1.0000000 0.3920289
#Agrclt 0.3920289 1.0000000
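One thing to keep in mind with the pairwise option is that each entry of the matrix can be based on a different number of rows. A small sketch to see how many rows each pair uses, again rebuilding swM as in the original post (obs, n_pair, pw, ok and by_hand are illustrative names, not anything from cor() itself):

```r
# Rebuild swM as in the example from the help page of cor()
swM <- swiss
colnames(swM) <- abbreviate(colnames(swiss), min = 6)
swM[1, 2] <- swM[7, 3] <- swM[25, 5] <- NA

# TRUE where a value is present; "+ 0" turns the logical matrix into 0/1
obs <- !is.na(swM)
# Entry [i, j] counts the rows where both column i and column j are observed
n_pair <- crossprod(obs + 0)
n_pair

# One pairwise entry recomputed by hand from its own complete rows
pw <- cor(swM, use = "pairwise.complete.obs")
ok <- complete.cases(swM[, c("Agrclt", "Exmntn")])
by_hand <- cor(swM$Agrclt[ok], swM$Exmntn[ok])
all.equal(pw["Agrclt", "Exmntn"], by_hand)
```

Because the entries come from different subsets of rows, ?cor notes that a pairwise matrix is not guaranteed to be positive semi-definite, which may matter if it is passed on to other methods.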
A.K.
On Sunday, April 13, 2014 9:11 PM, Paul Tanger <paul.tanger at colostate.edu> wrote:
Hi,
I can't seem to figure out why this gives me different answers. Probably
something obvious, but I thought they would be the same.
This is a minimal example from the help page of cor():
> ## swM := "swiss" with 3 "missing"s :
> swM <- swiss
> colnames(swM) <- abbreviate(colnames(swiss), min=6)
> swM[1,2] <- swM[7,3] <- swM[25,5] <- NA # create 3 "missing"
> cor(swM, use = "na.or.complete")
           Frtlty      Agrclt     Exmntn      Eductn     Cathlc      Infn.M
Frtlty  1.0000000  0.37821953 -0.6548306 -0.67421581  0.4772298  0.38781500
Agrclt  0.3782195  1.00000000 -0.7127078 -0.64337782  0.4014837 -0.07168223
Exmntn -0.6548306 -0.71270778  1.0000000  0.69776906 -0.6079436 -0.10710047
Eductn -0.6742158 -0.64337782  0.6977691  1.00000000 -0.1701445 -0.08343279
Cathlc  0.4772298  0.40148365 -0.6079436 -0.17014449  1.0000000  0.17221594
Infn.M  0.3878150 -0.07168223 -0.1071005 -0.08343279  0.1722159  1.00000000
> # why isn't this the same?
> cor(swM[,c(1:2)], use = "na.or.complete")
          Frtlty    Agrclt
Frtlty 1.0000000 0.3920289
Agrclt 0.3920289 1.0000000