[R] linear regression

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Aug 9 12:21:15 CEST 2004


Donald Lehmann <donald.lehmann at pharmacology.oxford.ac.uk> writes:

> Dear Consultant
> 
> I've done linear regression successfully on R a few times before.  But
> this time it keeps telling me:-
> 
> "Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
>          0 (non-NA) cases"
> 
> The model is:-
> 
> fm1 <- lm(TS.CM ~ AGE + SEX + HFE.Y.01 + TFC2B.01 + HFE.Y.01*TFC2B.01,
> data = IRONresults, subset = DIAG2.1D == 0)
> summary (fm1)
> 
> TS.CM is a continuous variable (%s), sex is coded 0 = women, 1 = men,
> DIAG2.1D is coded 0 = non-demented, 1 = Alzheimer's disease and the
> genes, HFE.Y.01 & TFC2B.01, are coded 0 = non-carrier and 1 = carrier
> 
> I've tried recoding the data to use 1 & 2, instead of 0 & 1, and I've
> removed the rows with missing data.  I've also tried putting
> "...lm(formula = TS.CM ~ ..."  But I always get the same error message
> 
> What am I doing wrong?


You don't need to give the main effects when there's a "*" term.
Using "*" for the interaction alone is a SASism; the R equivalent is
":", and a*b == a+b+a:b by definition. But that is hardly the main
problem.
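
To make the a*b == a+b+a:b point concrete, here is a minimal sketch.
The data frame d and its columns y, a and b are made up for
illustration and have nothing to do with IRONresults.

```r
# Hypothetical toy data, not the poster's data set.
set.seed(1)
d <- data.frame(y = rnorm(40),
                a = rbinom(40, 1, 0.5),
                b = rbinom(40, 1, 0.5))

# a*b expands to a + b + a:b, so the term labels agree.
t1 <- attr(terms(y ~ a * b), "term.labels")
t2 <- attr(terms(y ~ a + b + a:b), "term.labels")
identical(t1, t2)

# Spelling out the main effects next to a*b is redundant but harmless:
# duplicated terms are dropped when the formula is expanded.
t3 <- attr(terms(y ~ a + b + a * b), "term.labels")
identical(t1, t3)

# The fitted coefficients agree as well.
all.equal(coef(lm(y ~ a * b, data = d)),
          coef(lm(y ~ a + b + a * b, data = d)))
```

So the formula in the original call is longer than it needs to be, but
it cannot by itself produce the "0 (non-NA) cases" error.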

Could you have a look at the output of this?

with(IRONresults, complete.cases(TS.CM, AGE, SEX, HFE.Y.01, TFC2B.01))

If you get all FALSE, every row has a missing value in at least one of
the model variables, and you'll know what hit you...
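
A sketch of how that check might be pushed one step further, to see
which variable is responsible. The data frame toy below is a made-up
stand-in for IRONresults; that a single column is entirely NA is only
an assumption about one way the error can arise, not a diagnosis of
the actual data.

```r
# Hypothetical stand-in for IRONresults: TFC2B.01 is entirely NA here.
toy <- data.frame(
  TS.CM    = c(31.2, 28.5, NA, 40.1, 35.0, 22.7),
  AGE      = c(71, 68, 80, 75, NA, 66),
  SEX      = c(0, 1, 1, 0, 0, 1),
  HFE.Y.01 = c(0, 1, 0, 0, 1, 0),
  TFC2B.01 = NA_real_,
  DIAG2.1D = c(0, 0, 1, 0, 1, 0)
)

vars <- c("TS.CM", "AGE", "SEX", "HFE.Y.01", "TFC2B.01")

# Rows usable by lm(): every model variable must be non-missing.
ok <- complete.cases(toy[, vars])
sum(ok)

# Missing values per variable; a count equal to nrow(toy) marks the culprit.
n_missing <- colSums(is.na(toy[, vars]))
n_missing

# Usable rows inside the subset used in the original call.
sum(ok & toy$DIAG2.1D == 0, na.rm = TRUE)
```

If the last count is zero, lm() has nothing left to fit once the
subset and the default NA handling have been applied.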
 
> A related question: what's the minimum no. of data points for
> regression analysis to work?  We have only 23 cases carrying both
> genes out of 447 and only 8 out of 264 in the above subset (i.e.
> non-demented).  I seem to remember hearing somewhere that you needed a
> minimum of ~30 (?), so probably this wouldn't work anyway.  Still, I'd
> like to know what I was doing wrong!

Technically, you just need linearly independent predictors and more
observations than parameters (incl. the intercept). Other bounds get
bandied about on what should be required for a *meaningful* analysis
(like "10 observations per parameter"), but these are quite heuristic
and empirical in nature.
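
The technical requirement can be checked directly on the design
matrix. The following is a sketch on constructed data; the names d, g1
and g2 are hypothetical, with g1 and g2 standing in for the two
carrier indicators.

```r
# Hypothetical balanced toy design: every SEX/g1/g2 combination twice.
grid <- expand.grid(SEX = 0:1, g1 = 0:1, g2 = 0:1)
d <- rbind(grid, grid)
d$AGE <- seq(60, 90, by = 2)   # 16 distinct ages

# Design matrix for AGE + SEX + g1 + g2 + g1:g2, intercept included.
X <- model.matrix(~ AGE + SEX + g1 * g2, data = d)
c(observations = nrow(X), parameters = ncol(X), rank = qr(X)$rank)

# Drop every row carrying both genes: the g1:g2 column becomes all
# zeros, so the predictors are no longer linearly independent.
keep <- !(d$g1 == 1 & d$g2 == 1)
X2 <- X[keep, ]
c(observations = nrow(X2), parameters = ncol(X2), rank = qr(X2)$rank)
```

When the rank equals the number of parameters and there are more
observations than parameters, all coefficients are estimable. In the
second case the rank falls short, and lm() would report NA for the
interaction coefficient.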

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



