[R] Subsetting data by eliminating redundant variables

R. Michael Weylandt michael.weylandt at gmail.com
Wed Oct 19 16:16:39 CEST 2011


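Assuming you are talking about redun() from the Hmisc package, it's
much easier than you are making it: the fitted object already stores
the variable names, so there is no need to parse the printed output.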

n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n)/10
x4 <- x1 + x2 + x3 + runif(n)/10
x5 <- factor(sample(c('a','b','c'),n,replace=TRUE))
x6 <- 1*(x5=='a' | x5=='c')
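# Build the data frame directly; going through cbind() first would
# coerce the factor x5 to its integer codes
data1 <- data.frame(x1,x2,x3,x4,x5,x6)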

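library(Hmisc)

# Keep the fitted object rather than capturing what it prints
V <- redun(~., data = data1, r2 = 0.8)

V$In   # names of the variables that were kept

V$Out  # names of the variables flagged as redundant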
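
Since both components hold column names, they can be used to index the
data frame directly. Here is a minimal base-R sketch of that last step.
It uses hand-typed stand-ins (kept and dropped, names I made up) in place
of V$In and V$Out so that it runs without Hmisc; it assumes both are
character vectors, and the values simply mirror the example output quoted
below rather than being computed here.

```r
# Toy data frame with the same column names as data1 above
set.seed(1)
n <- 10
data1 <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n),
                    x4 = runif(n),
                    x5 = factor(sample(c('a','b','c'), n, replace = TRUE)),
                    x6 = runif(n))

# Hypothetical stand-ins for V$In and V$Out (assumed character vectors)
kept    <- c("x1", "x2", "x5")
dropped <- c("x6", "x4", "x3")

# Option 1: keep only the retained variables
reduced1 <- data1[, kept, drop = FALSE]

# Option 2: drop the redundant variables instead
reduced2 <- data1[, !(names(data1) %in% dropped), drop = FALSE]

names(reduced1)
names(reduced2)
```

With the real fit, the same idea would read data1[, V$In, drop = FALSE],
and nothing needs editing by hand when the data or the number of
variables changes.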

Michael

On Wed, Oct 19, 2011 at 6:49 AM, aajit75 <aajit75 at yahoo.co.in> wrote:
> Dear All,
>
> I am new to R and have a question that might be easy.
>
> I have a large data set with more than 250 variables, and I am reducing the
> number of variables with the redun function, as in the example below:
>
> n <- 100
> x1 <- runif(n)
> x2 <- runif(n)
> x3 <- x1 + x2 + runif(n)/10
> x4 <- x1 + x2 + x3 + runif(n)/10
> x5 <- factor(sample(c('a','b','c'),n,replace=TRUE))
> x6 <- 1*(x5=='a' | x5=='c')
> data1 <- cbind(x1,x2,x3,x4,x5,x6)
> data2 <- data.frame(data1)
> outredun <- redun(~., data=data2, r2=.8)
> outredun
> #outredun1 <- capture.output(redun(~., data=data2, r2=.8))
> #outredun1
> #x25 <- outredun1[25]
> #mydata12 <- data1[myvars] #myvars I need to pass to retain variables
>
> which gives me, for this example, "Redundant variables: x6 x4 x3" and
> "Predicted from variables: x1 x2 x5" as output in the console.
>
> I want to subset my original data either by keeping the 'Predicted from
> variables' or by dropping the 'Redundant variables'. I have tried using the
> capture.output function, as in the commented code above, but it
> gives me a string like "x1 x2 x5 ", which needs to be turned into "x1", "x2", "x5"
> before it can be used to subset the data.
>
> My data has more than 250 variables, and the data and the number of
> variables change every time. How can this be achieved?
>
> Thanks in advance for the help.
>
> Regards,
> -Ajit
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Subsetting-data-by-eliminating-redundant-variables-tp3918199p3918199.html
> Sent from the R help mailing list archive at Nabble.com.