[R] multiple regression w/ no intercept; strange results

Mon Jun 29 15:39:04 CEST 2009

On Sun, Jun 28, 2009 at 3:38 AM, Dieter Menne
<dieter.menne at menne-biomed.de> wrote:

>  It seems odd to me that dropping the intercept
> would cause the R^2 and F stats to rise so dramatically, and the p
> value to consequently drop so much.  In my implementation, I get the
> same beta1 and beta2, and the R2 I compute using the
>
> Removing the intercept can harm your sanity. See
>
> http://markmail.org/message/q67jf7uaig7d4tkm
>
> for an example.

I read the paper and the example; thanks for sending those along.
The paper made some good arguments, from a modeling perspective, for
keeping the intercept -- the most convincing to me is that you
would like the model to be robust to location and scale
transformations.
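
The location-invariance point can be illustrated with a minimal Python
sketch. The data, the single predictor, and the helper names
slope_with_intercept and slope_through_origin are all made up for this
illustration and are not taken from the paper or from the code in this
thread. Shifting y by a constant leaves the slope unchanged when an
intercept is fitted, but changes it when the fit is forced through the
origin.

```python
def slope_with_intercept(x, y):
    """Least-squares slope of y = a + b*x (intercept fitted)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    return sxy / sxx


def slope_through_origin(x, y):
    """Least-squares slope of y = b*x (no intercept)."""
    return sum(p * q for p, q in zip(x, y)) / sum(p * p for p in x)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_shifted = [q + 100.0 for q in y]  # pure location shift of the response

# With an intercept, the shift is absorbed by the intercept term.
b_int = slope_with_intercept(x, y)
b_int_shifted = slope_with_intercept(x, y_shifted)

# Without an intercept, the slope has to absorb the shift itself.
b_org = slope_through_origin(x, y)
b_org_shifted = slope_through_origin(x, y_shifted)

print("with intercept:", b_int, b_int_shifted)
print("through origin:", b_org, b_org_shifted)
```

In the through-origin fit the slope moves by 100 * sum(x) / sum(x^2),
so the estimate depends on where the origin of y happens to sit.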

But my question was more numerical: in particular, the R^2 of the
model should be equal to the square of the correlation between the
fitted values and the actual values.  This holds with the intercept
but not without it, as my code example shows.  Am I correct in assuming
these should always be the same, and if they are not, does it reflect
a bug in R or perhaps a numerical instability?
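
For concreteness, here is a minimal Python sketch of the comparison being
asked about. It uses made-up data and a single predictor, both of which
are assumptions for illustration and not the code referred to above. It
relies on one documented fact: R's summary.lm defines R^2 as
1 - RSS/TSS, where TSS is taken about the mean of y when an intercept is
present and about zero otherwise. The sketch computes both versions
alongside the squared correlation between fitted and actual values.

```python
def mean(v):
    return sum(v) / len(v)


def corr_sq(a, b):
    """Squared Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab * sab / (saa * sbb)


# Hypothetical data: x and y both sit far from zero and are weakly related.
x = [9.0, 10.0, 11.0, 10.0, 9.0, 11.0]
y = [10.5, 9.5, 10.5, 10.5, 9.5, 10.0]
mx, my = mean(x), mean(y)

# Fit with intercept: y = a + b*x
b1 = sum((p - mx) * (q - my) for p, q in zip(x, y)) / sum((p - mx) ** 2 for p in x)
a1 = my - b1 * mx
fit1 = [a1 + b1 * p for p in x]
rss1 = sum((q - f) ** 2 for q, f in zip(y, fit1))
r2_with_intercept = 1 - rss1 / sum((q - my) ** 2 for q in y)  # centered TSS

# Fit without intercept: y = b*x
b0 = sum(p * q for p, q in zip(x, y)) / sum(p * p for p in x)
fit0 = [b0 * p for p in x]
rss0 = sum((q - f) ** 2 for q, f in zip(y, fit0))
r2_uncentered = 1 - rss0 / sum(q * q for q in y)  # TSS about zero, as in summary.lm

print("with intercept:  R^2 =", r2_with_intercept, " corr^2 =", corr_sq(fit1, y))
print("no intercept:    R^2 =", r2_uncentered, " corr^2 =", corr_sq(fit0, y))
```

On this data the intercept model's R^2 matches the squared correlation,
while the no-intercept R^2 is far larger than the squared correlation of
its fitted values. The gap comes from the change of baseline (zero
instead of the mean of y), not from rounding error.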

You also wrote in your post "There are reasons why the standard
textbooks...".  I read the reasons Venables addressed in his
"Exegeses", but none of these seem to address my particular concern.
Can you elaborate on these or provide additional links?

Thanks!
JDH