[R] Conditional logistic regression for "events/trials" format

Charles C. Berry cberry at tajo.ucsd.edu
Thu May 31 19:11:50 CEST 2007

On Thu, 31 May 2007, Strickland, Matthew (CDC/CCHP/NCBDDD) (CTR) wrote:

> Dear R users,
> I have a large individual-level dataset (~700,000 records) which I am
> performing a conditional logistic regression on. Key variables include
> the dichotomous outcome, dichotomous exposure, and the stratum to which
> each person belongs.
> Using this individual-level dataset I can successfully use clogit to
> create the model I want. However reading this large .csv file into R and
> running the models takes a fair amount of time.
> Alternatively, I could choose to "collapse" the dataset so that each row
> has the number of events, number of individuals, and the exposure and
> stratum. In SAS they call this the "events/trials" format. This would
> make my dataset much smaller and presumably speed things up.

I think you have described the data for forming a 2 by 2 by K table of counts.

In which case, loglin(), loglm(), mantelhaen.test(), and - if K is not too 
large - glm(... , family=poisson)  would be suitable.
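
A minimal sketch of that suggestion, assuming collapsed data with the hypothetical column names stratum, exposure, events and trials (the numbers below are made up for illustration). It expands the events/trials rows into cell counts, builds the 2 by 2 by K table with xtabs(), and then calls mantelhaen.test() and a Poisson glm() with no three-way interaction:

```r
# Hypothetical collapsed ("events/trials") data: one row per
# stratum x exposure cell. Column names and counts are assumptions.
d <- data.frame(
  stratum  = rep(1:3, each = 2),
  exposure = rep(c(0, 1), times = 3),
  events   = c(10, 20, 5, 12, 8, 15),
  trials   = c(100, 100, 80, 90, 120, 110)
)

# Expand to one row per exposure x outcome x stratum cell with its count.
long <- rbind(
  data.frame(d[c("stratum", "exposure")], outcome = 1, n = d$events),
  data.frame(d[c("stratum", "exposure")], outcome = 0, n = d$trials - d$events)
)

# 2 x 2 x K table of counts: exposure by outcome by stratum.
tab <- xtabs(n ~ exposure + outcome + stratum, data = long)

# Mantel-Haenszel test and estimate of the common odds ratio.
mh <- mantelhaen.test(tab)

# Exact conditional test; its estimate is the conditional MLE
# of the common odds ratio.
mh_exact <- mantelhaen.test(tab, exact = TRUE)

# Log-linear (Poisson) model with all two-way terms involving stratum
# plus the exposure:outcome term, i.e. no three-way interaction.
# This adds parameters for every stratum, so it suits moderate K only.
fit <- glm(n ~ factor(stratum) * (exposure + outcome) + exposure:outcome,
           family = poisson, data = long)
or_poisson <- exp(coef(fit)[["exposure:outcome"]])

print(mh$estimate)
print(mh_exact$estimate)
print(or_poisson)
```

With exact = TRUE, mantelhaen.test() reports the conditional maximum likelihood estimate of the common odds ratio, which is what a conditional logistic regression with a single dichotomous exposure estimates. The Poisson fit gives the unconditional estimate and carries parameters for every stratum, hence the caveat about K.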

But you say 'models' above, suggesting that there are some other 
variables. If so, you need to be a bit more specific in describing your 

> So my question is: can I use clogit (or possibly another function) to
> perform a conditional logistic regression when the data is in this
> "events/trials" format? I am using R version 2.5.0.
> Thank you very much,
> Matt Strickland
> Birth Defects Branch
> U.S. Centers for Disease Control
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Charles C. Berry                        (858) 534-2098
                                          Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	         UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0901
