[R] Re: how to automatically select certain columns using for loop in dataframe
Petr PIKAL
petr.pikal at precheza.cz
Fri Apr 10 09:10:20 CEST 2009
Hi,
I am not fond of complicated paste() loops, so I would prefer
for (i in 1:4) print(na.omit(all.data[, last.char(names(all.data)) %in% col_names[i]]))
with a last.char function, defined before the loop runs, like this:
last.char <- function(x) substring(x, first = nchar(x), last = nchar(x))
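
For anyone who wants to run this end to end, here is a minimal sketch using the sample data from the question below. It first shows why the original condition never filters anything: is.na() is applied to the pasted string (the column name), not to the column itself. It then runs the last.char approach and, as an alternative, a variant that keeps the paste() calls but looks the column up with [[ ]]. The names `result`, `fixed`, `name_col` and `num_col`, and the stringsAsFactors = FALSE argument, are my own additions for illustration.

```r
# Sample data from the question (stringsAsFactors = FALSE is an added assumption)
all.data <- data.frame(
  NUM_A = 1:5, NAME_A = c("Andy", "Andrew", "Angus", "Alex", "Argo"),
  NUM_B = 1:5, NAME_B = c(NA, "Barn", "Bolton", "Bravo", NA),
  NUM_C = 1:5, NAME_C = c("Candy", NA, "Cecil", "Crayon", "Corey"),
  NUM_D = 1:5, NAME_D = c("David", "Delta", NA, NA, "Dummy"),
  stringsAsFactors = FALSE
)
col_names <- c("A", "B", "C", "D")

# Why the original loop keeps every row: the test is applied to the
# string "NAME_B", which is never NA, so the condition is always TRUE.
is.na(paste("NAME_", "B", sep = ""))    # FALSE

# Approach 1: pick columns by the last character of their names, then
# drop rows containing NA. This assumes one-character suffixes.
last.char <- function(x) substring(x, first = nchar(x), last = nchar(x))

result <- lapply(col_names, function(s) {
  na.omit(all.data[, last.char(names(all.data)) %in% s])
})
names(result) <- col_names

# Approach 2: keep the paste() calls, but index the data frame with the
# resulting name so that is.na() sees the column values.
fixed <- lapply(col_names, function(s) {
  name_col <- paste("NAME_", s, sep = "")
  num_col  <- paste("NUM_", s, sep = "")
  all.data[!is.na(all.data[[name_col]]), c(num_col, name_col)]
})
names(fixed) <- col_names

for (s in col_names) print(fixed[[s]])
```

Both variants keep the same rows for this data. They would differ if a NUM_ column contained NA, because na.omit() drops a row when any selected column is NA, while the second variant only tests the NAME_ column. To treat zero or another specific value as missing, the condition in the second variant can be extended accordingly.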
Regards
Petr
r-help-bounces at r-project.org wrote on 10.04.2009 00:30:37:
> Hi,
>
> I am trying to display / print certain columns in my data frame that share
> a certain condition (for example, part of the column name). I am using a for
> loop, as follows:
>
> # below is the sample data structure
> all.data <- data.frame( NUM_A = 1:5, NAME_A = c("Andy", "Andrew", "Angus", "Alex", "Argo"),
>                         NUM_B = 1:5, NAME_B = c(NA, "Barn", "Bolton", "Bravo", NA),
>                         NUM_C = 1:5, NAME_C = c("Candy", NA, "Cecil", "Crayon", "Corey"),
>                         NUM_D = 1:5, NAME_D = c("David", "Delta", NA, NA, "Dummy") )
>
> col_names <- c("A", "B", "C", "D")
>
> > all.data
>   NUM_A NAME_A NUM_B NAME_B NUM_C NAME_C NUM_D NAME_D
> 1     1   Andy     1   <NA>     1  Candy     1  David
> 2     2 Andrew     2   Barn     2   <NA>     2  Delta
> 3     3  Angus     3 Bolton     3  Cecil     3   <NA>
> 4     4   Alex     4  Bravo     4 Crayon     4   <NA>
> 5     5   Argo     5   <NA>     5  Corey     5  Dummy
> >
>
> Then for each col_names, I want to display the columns:
>
> for (each_name in col_names) {
>
>   sub.data <- subset( all.data,
>                       !is.na( paste("NAME_", each_name, sep = '') ),
>                       select = c( paste("NUM_", each_name, sep = ''),
>                                   paste("NAME_", each_name, sep = '') )
>                     )
>   print(sub.data)
> }
>
> the "incorrect" result:
>
>   NUM_A NAME_A
> 1     1   Andy
> 2     2 Andrew
> 3     3  Angus
> 4     4   Alex
> 5     5   Argo
>   NUM_B NAME_B
> 1     1   <NA>
> 2     2   Barn
> 3     3 Bolton
> 4     4  Bravo
> 5     5   <NA>
>   NUM_C NAME_C
> 1     1  Candy
> 2     2   <NA>
> 3     3  Cecil
> 4     4 Crayon
> 5     5  Corey
>   NUM_D NAME_D
> 1     1  David
> 2     2  Delta
> 3     3   <NA>
> 4     4   <NA>
> 5     5  Dummy
> >
>
> What I want to achieve is that the result should only display the NUM and
> NAME that is not NA. Here, the NA can be NULL, or zero (or other specific
> values).
>
> the "correct" result:
>
>   NUM_A NAME_A
> 1     1   Andy
> 2     2 Andrew
> 3     3  Angus
> 4     4   Alex
> 5     5   Argo
>   NUM_B NAME_B
> 2     2   Barn
> 3     3 Bolton
> 4     4  Bravo
>   NUM_C NAME_C
> 1     1  Candy
> 3     3  Cecil
> 4     4 Crayon
> 5     5  Corey
>   NUM_D NAME_D
> 1     1  David
> 2     2  Delta
> 5     5  Dummy
> >
>
> I am guessing that I don't use the correct type on the following statement
> (within the subset in the loop):
> !is.na( paste("NAME_", each_name, sep = '') )
>
> But then, I might be completely using a wrong approach.
>
> Any idea is definitely appreciated.
>
> Thank you,
>
> Ferry
>