[R] lm without intercept

Fri Feb 18 12:25:36 CET 2011

On Fri, 18 Feb 2011, Jan wrote:

> Hi,
>
> I am not a statistics expert, so I have this question. A linear model
> gives me the following summary:
>
> Call:
> lm(formula = N ~ N_alt)
>
> Residuals:
>    Min      1Q  Median      3Q     Max 
> -110.30  -35.80  -22.77   38.07  122.76 
>
> Coefficients:
>            Estimate Std. Error t value Pr(>|t|) 
> (Intercept)  13.5177   229.0764   0.059   0.9535 
> N_alt         0.2832     0.1501   1.886   0.0739 .
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
>
> Residual standard error: 56.77 on 20 degrees of freedom
>  (16 observations deleted due to missingness)
> Multiple R-squared: 0.151, Adjusted R-squared: 0.1086 
> F-statistic: 3.558 on 1 and 20 DF,  p-value: 0.07386 
>
> The regression is not very good (high p-value, low R-squared).

Yes.

> The Pr value for the intercept seems to indicate that it is zero with a
> very high probability (95.35%).

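Not quite. Consult your statistics textbook for the correct interpretation 
of p-values. The p-value is not the probability that the intercept is 
zero. Rather, under the null hypothesis of a true intercept of zero, it is 
very likely (probability 0.9535) to observe an estimated intercept at 
least as large in absolute value as 13.52.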
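
As a rough sketch of where the 0.9535 in the quoted summary comes from, 
the two-sided p-value can be recomputed from the printed estimate, its 
standard error and the residual degrees of freedom. The variable names 
below are only illustrative; the numbers are copied from the summary above.

```r
# Values copied from the quoted summary of lm(N ~ N_alt)
est <- 13.5177    # intercept estimate
se  <- 229.0764   # standard error of the intercept
df  <- 20         # residual degrees of freedom

# t statistic for H0: intercept = 0
t_value <- est / se

# Two-sided p-value: probability, under H0, of a t statistic at least
# as large in absolute value as the one observed
p_value <- 2 * pt(-abs(t_value), df = df)

print(round(t_value, 3))
print(round(p_value, 4))
```

A large p-value here only says that the data are compatible with an 
intercept of zero. With a standard error of about 229, they are compatible 
with many other intercepts as well.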

> So I repeat the regression forcing the intercept to zero:

Do you have a good interpretation for that?

> Call:
> lm(formula = N ~ N_alt - 1)
>
> Residuals:
>    Min      1Q  Median      3Q     Max 
> -110.11  -36.35  -22.13   38.59  123.23 
>
> Coefficients:
>      Estimate Std. Error t value Pr(>|t|) 
> N_alt 0.292046   0.007742   37.72   <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
>
> Residual standard error: 55.41 on 21 degrees of freedom
>  (16 observations deleted due to missingness)
> Multiple R-squared: 0.9855, Adjusted R-squared: 0.9848 
> F-statistic:  1423 on 1 and 21 DF,  p-value: < 2.2e-16 
>
> 1. Is my interpretation correct?
> 2. Is it possible that just by forcing the intercept to become zero, a
> bad regression becomes an extremely good one?
> 3. Why doesn't lm suggest a value of zero (or near zero) by itself if
> the regression is so much better with it?

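The model without intercept needs to be interpreted differently. The 
p-value compares a regression with intercept zero and slope 0.292 
against a model with both intercept zero and slope zero. If I had to 
guess, I would say this is not a very meaningful comparison for your data. 
The same is true for the R-squared, which in the case without intercept 
measures the improvement over predicting zero rather than over predicting 
the mean of the response (see also ?summary.lm for its definition). 
Hence the R-squared values of the two fits are not comparable, and the 
large value does not mean that the fit has improved.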
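
To illustrate the different definitions, here is a minimal sketch on 
simulated data. The data are made up (they are not Jan's data): the seed, 
the range of x and the noise level are arbitrary assumptions chosen only 
to mimic a predictor that lies far from zero.

```r
set.seed(1)                            # arbitrary seed, for reproducibility
n <- 22
x <- runif(n, 1200, 1800)              # made-up predictor, far from zero
y <- 0.29 * x + rnorm(n, sd = 55)      # made-up response

fit1 <- lm(y ~ x)                      # with intercept
fit0 <- lm(y ~ x - 1)                  # intercept forced to zero

rss1 <- sum(residuals(fit1)^2)
rss0 <- sum(residuals(fit0)^2)

# With intercept: residuals are compared with deviations from mean(y)
r2_centered <- 1 - rss1 / sum((y - mean(y))^2)

# Without intercept: residuals are compared with deviations from zero
r2_uncentered <- 1 - rss0 / sum(y^2)

# F statistic without intercept: slope-only model vs. the model y = 0
f_uncentered <- (sum(y^2) - rss0) / (rss0 / (n - 1))

print(all.equal(r2_centered,   summary(fit1)$r.squared))
print(all.equal(r2_uncentered, summary(fit0)$r.squared))
print(all.equal(f_uncentered,  unname(summary(fit0)$fstatistic["value"])))
```

The residuals of both fits are almost the same size. The R-squared without 
intercept is large simply because sum(y^2) is much larger than 
sum((y - mean(y))^2) when the response is far from zero.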

hth,
Z

> Please excuse my ignorance.
>
> Jan Rheinländer