[R] Partial correlations and p-values
dadrivr
dadrivr at gmail.com
Wed Dec 2 02:24:17 CET 2009
I am trying to calculate a partial correlation and p-values. Unfortunately,
the results in R are different than what SPSS gives.
Here is an example in R (calculating the partial correlation of x and y,
controlling for z1 and z2):
x <- c(1,20,14,30,9,4,8)
y <- c(5,6,7,9,NA,10,6)
z1 <- c(13,8,16,14,26,13,20)
z2 <- c(12,NA,2,5,8,16,13)
fmx <- lm(x ~ z1 + z2, na.action = na.exclude)
fmy <- lm(y ~ z1 + z2, na.action = na.exclude)
yres <- resid(fmy)
xres <- resid(fmx)
cor(xres, yres, use = "p")
ct <- cor.test(xres, yres)
ct$estimate
ct$p.value
R give me:
r = .65, p = .23
However, SPSS calculates:
r = .46, p = .70
I think something may be different with R's handling of missing data, as
when I replace the NA's with values, R and SPSS give the same r-values,
albeit different p-values still. I am doing pairwise case exclusion in both
R and SPSS. Any ideas why I'm getting different values? Is something wrong
with my formula in R? Any help would be greatly appreciated. Thanks!
Peter Ehlers wrote:
>
>
> dadrivr wrote:
>> The variables have the same length, but with different numbers of missing
>> values (NA). As a result, the residuals calculations (xres & yres) have
>> different lengths, and I cannot compute the correlation between the two
>> (error of incompatible dimensions - see example below). Is there a way,
>> when calculating residuals, to leave the NAs in the residual calculation
>> and
>> output? Thanks!
>>
>> x <- c(1,20,14,NA,9)
>> y <- c(5,6,7,9,10)
>> z <- c(13,NA,16,14,NA)
>> xres <- residuals(lm(x ~ z))
>> yres <- residuals(lm(y ~ z))
>> cor(xres, yres)
>> ct <- cor.test(xres, yres)
>> ct$estimate
>> ct$p.value
>>
>
> Well, your example above just uses two points for the x on z
> regression and that gives zero-residuals. Let's use
> something a bit more realistic:
>
> set.seed(123)
> x <- rnorm(5)
> y <- rnorm(5)
> z <- rnorm(5)
> x[sample(5,1)] <- NA
> z[sample(5,1)] <- NA
> fmx <- lm(x ~ z, na.action = na.exclude)
> fmy <- lm(y ~ z, na.action = na.exclude)
> yres <- resid(fmy)
> xres <- resid(fmx)
> cor(xres, yres, use = "p") # -0.95978
> ct <- cor.test(xres, yres)
> ct$estimate # -0.95978
> ct$p.value # 0.04021994
>
> Check out the 'na.exclude' action in lm(): it preserves
> the length of the residual vector. Then the 'use='
> argument to cor() uses pairwise complete observations.
> cor.test() will do that automatically.
>
> -Peter Ehlers
>
>>
>> Ista Zahn wrote:
>>> 1) Think about what you did wrong. It doesn't make sense to do
>>> correlation/regression with variables of different lengths. You can
>>> have missing values in one or more variables, if that's what you mean.
>>> Just code them NA.
>>>
>>> 2) Just add in the predictors, e.g.
>>> residuals(lm(y ~ z1 + z2))
>>>
>>> -Ista
>>>
>>> On Wed, Nov 11, 2009 at 10:34 PM, dadrivr <dadrivr at gmail.com> wrote:
>>>> Awesome, that's what I was looking for. Â I have two additional
>>>> questions:
>>>> (1)
>>>> What can I do if the variables are of different lengths? (2) How do I
>>>> update
>>>> the formula if I want to control for more than one variable.
>>>>
>>>> Let's take the following example:
>>>> x <- c(1,20,14,7,9)
>>>> y <- c(5,6,7,9,10,11)
>>>> z <- c(13,27,16,5,4,17,20)
>>>> a <- c(4,6,7,1)
>>>>
>>>> xres <- residuals(lm(x ~ z))
>>>> yres <- residuals(lm(y ~ z))
>>>> cor(xres, yres)
>>>> ct <- cor.test(xres, yres)
>>>> ct$estimate
>>>> ct$p.value
>>>>
>>>> How do I update the above formula to:
>>>> (1) take into account that the variables are of different lengths? Â I
>>>> get
>>>> an
>>>> error when calculating the residuals.
>>>> (2) control for z and a (i.e., more than one variable)?
>>>>
>>>> Thanks so much for your help.
>>>>
>>>>
>>>> Peter Ehlers wrote:
>>>>>
>>>>> dadrivr wrote:
>>>>>> I'm trying to write code to calculate partial correlations (along
>>>>>> with
>>>>>> p-values). Â I'm new to R, and I don't know how to do this. Â I have
>>>>>> searched
>>>>>> and come across different functions, but I haven't been able to get
>>>>>> any
>>>>>> of
>>>>>> them to work (for example, pcor and pcor.test from the ggm package).
>>>>>>
>>>>>> In the following example, I am trying to compute the correlation
>>>>>> between
>>>>>> x
>>>>>> and y, while controlling for z (partial correlation):
>>>>>>
>>>>>> x <- c(1,20,14,7,9)
>>>>>> y <- c(5,6,7,9,10)
>>>>>> z <- c(13,27,16,5,4)
>>>>>>
>>>>>> What function can I append to this to find this partial correlation?
>>>>>> Many
>>>>>> thanks!
>>>>> I'm not sure what you need, but does this give you what
>>>>> you want:
>>>>>
>>>>> xres <- residuals(lm(x ~ z))
>>>>> yres <- residuals(lm(y ~ z))
>>>>> cor(xres, yres)
>>>>> # [1] 0.9778857
>>>>>
>>>>> or
>>>>>
>>>>> ct <- cor.test(xres, yres)
>>>>> ct$estimate Â # 0.9978857
>>>>> ct$p.value Â # 0.003934582
>>>>>
>>>>> Â -Peter Ehlers
>>>>>
>>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>>
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/Partial-correlations-and-p-values-tp26308463p26312873.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>> --
>>> Ista Zahn
>>> Graduate student
>>> University of Rochester
>>> Department of Clinical and Social Psychology
>>> http://yourpsyche.org
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
--
View this message in context: http://n4.nabble.com/Partial-correlations-and-p-values-tp908641p932659.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list