[R] ANCOVA for linear regressions without intercept
Yusuke Fukuda
Yusuke.Fukuda at nt.gov.au
Tue Apr 5 07:08:45 CEST 2011
Hi Steve
It worked with my data! I didn't think of combining the categories before doing ANOVA to test for the difference. This is the final answer to my question.
Thank you very much for your time.
Regards,
Yusuke
-----Original Message-----
From: Steven McKinney [mailto:smckinney at bccrc.ca]
Sent: Tuesday, 5 April 2011 12:08 PM
To: Yusuke Fukuda; 'Peter Ehlers'
Cc: r-help at r-project.org
Subject: RE: [R] ANCOVA for linear regressions without intercept
Hi Yusuke,
Does the following get what you are after?
### Make some test data.
> set.seed(123)
> edf <- data.frame(sex = c(rep("Male", 10), rep("Female", 10), rep("Unknown", 10)),
+ head_length = c(1.2 * c(170:179 + rnorm(10)), 0.8 * c(150:159 + rnorm(10)), c(160:169 + rnorm(10)))/10,
+ body_length = c(c(170:179 + rnorm(10)), c(150:159 + rnorm(10)), c(160:169 + rnorm(10)))
+ )
> edf$sex <- factor(as.character(edf$sex))
> plot(edf$head_length, edf$body_length, pch = as.numeric(edf$sex), col = as.numeric(edf$sex), xlim = c(0, 25), ylim = c(0, 190))
> lmf <- lm(body_length ~ head_length * sex, data = edf)
### The full model - do keep an eye on those intercepts and try to ensure they are not far from 0.
> summary(lmf)
Call:
lm(formula = body_length ~ head_length * sex, data = edf)
Residuals:
Min 1Q Median 3Q Max
-2.73783 -0.68133 0.02147 0.50858 2.38931
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.578 25.425 -0.141 0.8893
head_length 12.772 2.054 6.218 2e-06 ***
sexMale 15.122 37.464 0.404 0.6901
sexUnknown 40.308 33.137 1.216 0.2357
head_length:sexMale -4.977 2.438 -2.042 0.0523 .
head_length:sexUnknown -4.971 2.428 -2.047 0.0517 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.384 on 24 degrees of freedom
Multiple R-squared: 0.9802, Adjusted R-squared: 0.9761
F-statistic: 237.7 on 5 and 24 DF, p-value: < 2.2e-16
### Now suppress intercepts. head_length:sex should give interactions (slopes) only.
> lmrf <- lm(body_length ~ -1 + head_length : sex, data = edf)
> summary(lmrf)
Call:
lm(formula = body_length ~ -1 + head_length:sex, data = edf)
Residuals:
Min 1Q Median 3Q Max
-3.02782 -0.61861 -0.01079 0.68785 2.57544
Coefficients:
Estimate Std. Error t value Pr(>|t|)
head_length:sexFemale 12.48253 0.03549 351.8 <2e-16 ***
head_length:sexMale 8.34500 0.02097 398.0 <2e-16 ***
head_length:sexUnknown 10.03844 0.02677 375.0 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.389 on 27 degrees of freedom
Multiple R-squared: 0.9999, Adjusted R-squared: 0.9999
F-statistic: 1.409e+05 on 3 and 27 DF, p-value: < 2.2e-16
### Check the numeric coding of the factor
> with(edf, table(sex, as.numeric(sex)))
sex 1 2 3
Female 10 0 0
Male 0 10 0
Unknown 0 0 10
> abline(a = 0, b = coef(lmrf)[1], col = 1) ## Females = Black
> abline(a = 0, b = coef(lmrf)[2], col = 2) ## Males = Red
> abline(a = 0, b = coef(lmrf)[3], col = 3) ## Unknown = Green
### If no diff between males and females, then males and females can be combined into one group.
> edf$MvF <- as.character(edf$sex)
> edf$MvF[edf$MvF != "Unknown"] <- "MorF"
> edf$MvF <- factor(edf$MvF)
> with(edf, table(MvF, sex))
sex
MvF Female Male Unknown
MorF 10 10 0
Unknown 0 0 10
> lmr1f <- lm(body_length ~ -1 + head_length : MvF, data = edf)
> summary(lmr1f)
Call:
lm(formula = body_length ~ -1 + head_length:MvF, data = edf)
Residuals:
Min 1Q Median 3Q Max
-23.976 -21.656 0.077 35.899 39.839
Coefficients:
Estimate Std. Error t value Pr(>|t|)
head_length:MvFMorF 9.4156 0.3429 27.46 <2e-16 ***
head_length:MvFUnknown 10.0384 0.5085 19.74 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 26.39 on 28 degrees of freedom
Multiple R-squared: 0.9761, Adjusted R-squared: 0.9744
F-statistic: 571.9 on 2 and 28 DF, p-value: < 2.2e-16
### Test the hypothesis that male and female heights are equivalent
> anova(lmr1f, lmrf)
Analysis of Variance Table
Model 1: body_length ~ -1 + head_length:MvF
Model 2: body_length ~ -1 + head_length:sex
Res.Df RSS Df Sum of Sq F Pr(>F)
1 28 19496.1
2 27 52.1 1 19444 10077 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
### Plot the reduced model regression lines
> abline(a = 0, b = coef(lmr1f)[1], col = "blue", lty = 2)
> abline(a = 0, b = coef(lmr1f)[2], col = "orange", lty = 2, lwd = 4)
>
The other two tests can be set up and run similarly. Don't
forget to adjust for multiple comparisons...
HTH
Steve
Steven McKinney, Ph.D.
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Yusuke Fukuda [Yusuke.Fukuda at nt.gov.au]
Sent: April 4, 2011 5:45 PM
To: 'Peter Ehlers'
Cc: r-help at r-project.org; 'Bert Gunter'
Subject: Re: [R] ANCOVA for linear regressions without intercept
Thank you for your suggestions, stats experts. Much appreciated.
I still haven't got what I wanted but someone suggested looking into contrasts and this is looking worth trying http://finzi.psych.upenn.edu/R/library/gmodels/html/fit.contrast.html
Regards,
Yusuke
-----Original Message-----
From: Peter Ehlers [mailto:ehlers at ucalgary.ca]
Sent: Saturday, 2 April 2011 1:35 AM
To: Yusuke Fukuda
Cc: 'Bert Gunter'; r-help at r-project.org
Subject: Re: [R] ANCOVA for linear regressions without intercept
See inline.
On 2011-03-31 22:22, Yusuke Fukuda wrote:
> Thanks Bert.
>
> I have read "?formula" again and again, and I'm still struggling;
>
>> lm(body_length ~ head_length-1)
>
> This removes intercept from each individual regression (for male, female, unknown).
>
> When they are taken together,
>
>> lm(body_length ~ sex*head_length)
>
> This shows differences in slopes and intercepts between the regressions (but I want to compare the slopes of the regressions WITHOUT intercepts).
>
> If I put
>
>> lm(body_length ~ sex:head_length-1)
>
> This shows slopes for each sex without intercepts, but NOT differences in the slope between the regressions.
You probably want:
lm(body_length ~ head_length + sex:head_length-1)
or, in short form:
lm(body_length ~ head_length/sex-1)
You might then compare the model 'without intercepts'
(i.e. with intercepts forced to zero) with a model that
includes intercepts. If the intercepts turn out to
be significantly nonzero, what will you do?
Peter Ehlers
>
> I also tried
>
>> lm(body_length ~ sex*head_length-1)
>> lm(body_length ~ sex*head_length-sex-1)
>
> But none of them worked.
>
> Would anyone be able to help me? All I want to do is to compare the slopes of three linear regressions that go through the origin (0,0) so that I can say if their difference is significant or not.
>
> Thanks for your help.
>
>
>
> ________________________________________
> From: Bert Gunter [mailto:gunter.berton at gene.com]
> Sent: Friday, 1 April 2011 12:56 AM
> To: Yusuke Fukuda
> Cc: r-help at r-project.org
> Subject: Re: [R] ANCOVA for linear regressions without intercept
>
> If you haven't already received an answer, a careful reading of
>
> ?formula
>
> will provide it.
>
> -- Bert
> On Wed, Mar 30, 2011 at 11:42 PM, Yusuke Fukuda<Yusuke.Fukuda at nt.gov.au> wrote:
>
> Hello R experts
>
> I have two linear regressions for sexes (Male, Female, Unknown). All have a good correlation between body length (response variable) and head length (explanatory variable). I know it is not recommended, but for a good practical reason (the purpose of study is to find a single conversion factor from head length to body length), the regressions need to go through the origin (0 intercept).
>
> Is it possible to do ANCOVA for these regressions without intercepts? When I do
>
> summary(lm(body length ~ sex*head length))
>
> this will include the intercepts as below
>
>
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) -6.49697 1.68497 -3.856 0.000118 ***
> sexMale -9.39340 1.97760 -4.750 2.14e-06 ***
> sexUnknown -1.33791 2.35453 -0.568 0.569927
> head_length 7.12307 0.05503 129.443< 2e-16 ***
> sexMale:head_length 0.31631 0.06246 5.064 4.37e-07 ***
> sexUnknown:head_length 0.19937 0.07022 2.839 0.004556 **
> ---
>
> Is there any way I can remove the intercepts so that I can simply compare the slopes with no intercept taken into account?
>
> Thanks for help in advance.
>
> Yusuke Fukuda
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list