[R] R-squared value for linear regression passing through origin using lm()

S Ellison S.Ellison at lgc.co.uk
Thu Oct 18 15:00:49 CEST 2007

>I think there is reason to be surprised, I am, too. ...
>What am I missing?

Read the formula and ?summary.lm more closely. The denominator,

Sum((y[i]- y*)^2) 

is very large if the mean value of y is substantially nonzero and y* is
set to 0, as the calculation implies for a forced zero intercept. In
effect, the calculation gives the fraction of the sum of squared
deviations from the mean for the case with an intercept, but the
fraction of the sum of squared y ('about' zero) for the no-intercept
case.

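This is surprising only if you automatically assume that a higher R^2
means a better fit. The two values are measured against different
baselines, so they are not directly comparable. I guess that explains
why statisticians tell you not to use R^2 as a goodness-of-fit
indicator.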
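
A minimal sketch of the effect, using simulated data of my own (the
variable names, the seed and the numbers are illustrative assumptions,
not taken from the original question). It fits the same data with and
without an intercept, recomputes both R^2 values by hand from the
formula in ?summary.lm, and then scores the zero-intercept fit against
the mean-centred denominator for comparison:

```r
# Hypothetical data: x lies well away from zero, so mean(y) is large.
set.seed(1)
x <- seq(40, 60, length.out = 20)
y <- 2 * x + rnorm(20, sd = 5)

fit_int  <- lm(y ~ x)      # with intercept
fit_zero <- lm(y ~ x - 1)  # forced through the origin

# R^2 as reported by summary.lm()
r2_int  <- summary(fit_int)$r.squared
r2_zero <- summary(fit_zero)$r.squared

# The same values by hand: only the denominator (the choice of y*) differs.
rss_int  <- sum(residuals(fit_int)^2)
rss_zero <- sum(residuals(fit_zero)^2)
r2_int_manual  <- 1 - rss_int  / sum((y - mean(y))^2)  # y* = mean(y)
r2_zero_manual <- 1 - rss_zero / sum(y^2)              # y* = 0

# Zero-intercept fit scored against the mean-centred denominator instead.
r2_zero_centred <- 1 - rss_zero / sum((y - mean(y))^2)

print(c(with_intercept = r2_int,
        through_origin = r2_zero,
        through_origin_centred = r2_zero_centred))
```

With data like these the reported R^2 for the through-origin model
comes out higher than for the intercept model, even though its residual
sum of squares cannot be smaller. Scored against the same mean-centred
denominator, the through-origin fit is never the better of the two.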

>>> Ralf Goertz <R_Goertz at web.de> 18/10/2007 13:11:55 >>>
>> r.squared: R^2, the 'fraction of variance explained by the model',
>>
>>     R^2 = 1 - Sum(R[i]^2) / Sum((y[i]- y*)^2),
>>
>>     where y* is the mean of y[i] if there is an intercept and
>>     zero otherwise.


