[R] Re: Buying more computer for GLM
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Thu Aug 31 13:44:28 CEST 2006
g.russell at eos-finance.com writes:
> Hello,
>
> at the moment I am doing quite a lot of regression, especially
> logistic regression, on 20000 or more records with 30 or more
> factors, using the "step" function to search for the model with the
> smallest AIC. This takes a lot of time on this 1.8 GHz Pentium
> box. Memory does not seem to be such a big problem; not much
> swapping is going on and CPU usage is at or close to 100%. What
> would be the most cost-effective way to speed this up? The
> obvious way would be to get a machine with a faster processor (3 GHz
> plus), but I wonder whether it might instead be better to run a
> dual-processor machine or something like that; this looks at least
> like a problem R should be able to parallelise, though I don't know
> whether it does.
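A minimal sketch of the workflow described above, on simulated data (all variable names here are illustrative, not from the original post):

```r
# Stepwise AIC search over a logistic regression, as described above.
set.seed(1)
n  <- 200
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
# True model uses only x1 and x2; x3 is noise.
p  <- plogis(0.5 * x1 - 1.0 * x2)
y  <- rbinom(n, 1, p)

full <- glm(y ~ x1 + x2 + x3, family = binomial)

# step() adds/drops terms, keeping the move that lowers AIC most.
best <- step(full, direction = "both", trace = 0)
```

With 20000+ records and 30+ factors, each step() iteration refits many candidate models, which is where the CPU time goes.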
Is this floating-point bound? (When you say 30 factors, do you mean 30
parameters, or factors representing a much larger number of groups?)

If it is integer bound, I don't think you can do much better than
increasing CPU speed and - note - memory bandwidth (look for
large-cache systems and a fast front-side bus). To increase
floating-point performance, you might consider using an optimized BLAS
such as ATLAS (see Windows FAQ 8.2 and/or the "R Installation and
Administration" manual); this in turn may be multithreaded and make
use of multiple CPUs or multi-core CPUs.
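A quick way to gauge whether the linked BLAS is the bottleneck is to time a dense matrix operation that BLAS handles; this is a rough sketch, and absolute timings vary by machine:

```r
# Time a dense cross-product, which is dispatched to the linked BLAS.
# With an optimized, multithreaded BLAS (e.g. ATLAS) this should be
# substantially faster than with R's reference BLAS.
set.seed(1)
m <- matrix(rnorm(1000 * 1000), nrow = 1000, ncol = 1000)
t <- system.time(cp <- crossprod(m))
print(t["elapsed"])
```

If this kind of operation is fast but the model search is still slow, the bottleneck is more likely the per-iteration glm fitting loop than raw floating-point throughput.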
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907