[R] Logistic Regression with 200K features in R?

Eik Vettorazzi E.Vettorazzi at uke.de
Thu Dec 12 12:12:16 CET 2013


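The underlying problem is that you can't fit an ordinary (unpenalized)
regression with more predictors than observations: with 150 rows and
about 200,000 columns the coefficients are not uniquely determined.
(The allocMatrix message itself is raised earlier, while R expands the
formula V1 ~ . into a terms matrix that is too large to allocate, but a
smaller data set with more predictors than observations would still not
give a usable fit.)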
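
A small sketch of what "more predictors than observations" does to glm(), using simulated data rather than the original file. The names toy, n, p and x are made up for this illustration, and the sizes (10 rows, 20 predictors) are chosen only to keep the example fast:

```r
# Simulated data: 10 observations, 20 predictors (hypothetical sizes).
set.seed(1)
n <- 10
p <- 20
x <- matrix(rnorm(n * p), nrow = n)
toy <- data.frame(y = rbinom(n, 1, 0.5), x)

# The call runs, but typically warns about non-convergence or fitted
# probabilities of 0 or 1; the warnings are silenced here for brevity.
fit <- suppressWarnings(glm(y ~ ., data = toy, family = binomial))

# The design matrix has p + 1 columns but only n rows, so its rank is
# at most n, and coefficients that cannot be estimated come back as NA.
fit$rank
sum(is.na(coef(fit)))
```

Whatever the exact numbers, the rank cannot exceed the number of rows, so at least p + 1 - n coefficients are reported as NA.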

Cheers.

On 12.12.2013 09:00, Romeo Kienzler wrote:
> Dear List,
> 
> I'm quite new to R and want to do logistic regression with a 200K
> feature data set (around 150 training examples).
> 
> I'm aware that I should use Naive Bayes but I have a more general
> question about the capability of R handling very high dimensional data.
> 
> Please consider the following R code where "mygenestrain.tab" is a 150
> by 200000 matrix:
> 
> traindata <- read.table('mygenestrain.tab')
> mylogit <- glm(V1 ~ ., data = traindata, family = "binomial")
> 
> When executing this code I get the following error:
> 
> Error in terms.formula(formula, data = data) :
>   allocMatrix: too many elements specified
> Calls: glm ... model.frame -> model.frame.default -> terms -> terms.formula
> Execution halted
> 
> Is this because R can't handle 200K features or am I doing something
> completely wrong here?
> 
> Thanks a lot for your help!
> 
> Best regards,
> 
> Romeo

-- 
Eik Vettorazzi

Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790