[R] linear regression
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Mon Aug 9 12:21:15 CEST 2004
Donald Lehmann <donald.lehmann at pharmacology.oxford.ac.uk> writes:
> Dear Consultant
>
> I've done linear regression successfully on R a few times before. But
> this time it keeps telling me:-
>
> "Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
> 0 (non-NA) cases"
>
> The model is:-
>
> fm1 <- lm(TS.CM ~ AGE + SEX + HFE.Y.01 + TFC2B.01 + HFE.Y.01*TFC2B.01,
> data = IRONresults, subset = DIAG2.1D == 0)
> summary (fm1)
>
> TS.CM is a continuous variable (%s), sex is coded 0 = women, 1 = men,
> DIAG2.1D is coded 0 = non-demented, 1 = Alzheimer's disease and the
> genes, HFE.Y.01 & TFC2B.01, are coded 0 = non-carrier and 1 = carrier
>
> I've tried recoding the data to use 1 & 2, instead of 0 & 1, and I've
> removed the rows with missing data. I've also tried putting
> "...lm(formula = TS.CM ~ ..." But I always get the same error message
>
> What am I doing wrong?

You don't need to give the main effects when there's a "*" term
(that's a SASism; in R the pure interaction operator is ":", and
a*b == a+b+a:b by definition), but that is hardly the main problem.
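
To make the a*b == a+b+a:b point concrete, here is a small sketch that
asks R which terms it derives from the formula as posted and from a
shorter version of it. The shorter formula is my rewrite, not something
from the original post, and no data are needed for the comparison.

```r
# Formula as originally written: main effects listed explicitly,
# followed by the "*" term.
long_form  <- TS.CM ~ AGE + SEX + HFE.Y.01 + TFC2B.01 + HFE.Y.01*TFC2B.01

# Shorter version: "*" already brings in both main effects.
short_form <- TS.CM ~ AGE + SEX + HFE.Y.01*TFC2B.01

# terms() expands the formula; "term.labels" lists the resulting terms.
long_terms  <- attr(terms(long_form),  "term.labels")
short_terms <- attr(terms(short_form), "term.labels")

print(long_terms)
print(identical(long_terms, short_terms))
```

Both formulas should expand to the same set of terms, so the two lm()
calls would fit the same model.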

Could you have a look at the output of this?

  with(IRONresults, complete.cases(TS.CM, AGE, SEX, HFE.Y.01, TFC2B.01))

If you get all FALSE, you'll know what hit you...

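
For illustration, the sketch below reproduces the same error on a made-up
data frame in which every row has a missing value in at least one model
variable. The data frame `toy` and its columns are hypothetical and are
not the poster's data; the per-column NA count is an extra check I find
useful for spotting which variable is responsible.

```r
# Hypothetical toy data: each row is incomplete in y or x1.
toy <- data.frame(
  y  = c(1.2, NA, 3.1, NA),
  x1 = c(NA, 2, NA, 4),
  x2 = c(0, 1, 0, 1)
)

# Which rows are complete across all model variables?
ok <- with(toy, complete.cases(y, x1, x2))
print(ok)

# How many missing values does each column have?
print(colSums(is.na(toy)))

# lm() drops incomplete rows by default (na.action = na.omit), so
# nothing is left to fit; capture the error message instead of stopping.
res <- tryCatch(
  lm(y ~ x1 + x2, data = toy),
  error = function(e) conditionMessage(e)
)
print(res)
```

With a `subset` argument, as in the original call, the same check is
best applied to the subsetted rows only, since a variable can be
entirely missing within one subgroup while being present elsewhere.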
> A related question: what's the minimum no of data points for
> regression analysis to work? We have only 23 cases carrying both
> genes out of 447 and only 8 out of 264 in the above subset (ie
> non-demented). I seem to remember hearing somewhere that you needed a
> minimum of ~30 (?), so probably this wouldn't work anyway. Still, I'd
> like to know what I was doing wrong!
Technically, you just need linearly independent predictors and more
observations than parameters (incl. the intercept). Other bounds get
bandied about on what should be required for a *meaningful* analysis
(like "10 observations per parameter"), but these are quite heuristic
and empirical in nature.
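
A minimal sketch of the two technical conditions, using made-up numbers
(the data frame `d` and its columns are hypothetical): with as many
observations as parameters the fit leaves no residual degrees of
freedom, and with a linearly dependent predictor lm() cannot estimate
all coefficients.

```r
# Hypothetical data, four observations.
d <- data.frame(
  x1 = c(1, 2, 4, 7),
  x2 = c(0, 1, 0, 1),
  y  = c(2.0, 3.5, 1.0, 6.0)
)

# Three observations, three parameters (intercept, x1, x2):
# the model fits exactly and nothing is left to estimate the error.
fit_sat <- lm(y ~ x1 + x2, data = d[1:3, ])
print(df.residual(fit_sat))

# x3 is an exact multiple of x1, so the predictors are not
# linearly independent; lm() reports NA for the redundant one.
d$x3 <- 2 * d$x1
fit_dep <- lm(y ~ x1 + x3, data = d)
print(coef(fit_dep))
```

Neither case raises an error, which is why the output of summary() is
worth reading even when lm() appears to run cleanly.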
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907