[R] Model Selection based on individual t-values with the largest possible number of variables in regression

Frank Harrell f.harrell at vanderbilt.edu
Wed Apr 3 18:31:32 CEST 2013

To say that these strategies represent bad statistical practice is to put it

mister_O wrote
> Dear R-Community,
> While writing my master's thesis, I ran into a difficult issue. In analyzing
> capital structure determinants, I have one dependent variable (total debt
> ratio = TD) and 15 independent ones. In the first stage I normalized my
> data by deleting outliers from each variable (pairwise deletion), and as
> a result every variable has a different length. Now, when
> selecting relevant variables for the "best" model, none of the stepwise,
> forward, or backward procedures works well, since there are a
> number of other combinations of variables which also have high t-values.
> Thus, whichever model I pick, I never know whether it is
> trustworthy. I tried to calculate all possible combinations of independent
> variables, but since I have 15 of them, there are thousands of such
> combinations and R simply refuses to calculate them (the computer crashes)! I
> wonder if you can help me write R code to find the
> model which includes as many variables as possible with significant
> t-values?
> cheers, Oleg

Frank Harrell
Department of Biostatistics, Vanderbilt University
