[R] dataframe operation
Gabor Grothendieck
ggrothendieck at gmail.com
Wed Jan 24 22:21:09 CET 2007
Here is a slight variation on Marc's idea:
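isna <- is.na(DF)  # logical matrix: TRUE where DF is NA
# 100 * col(isna) holds 100, 200, 300 by column; put NA back where DF had NA
DF[] <- replace(100 * col(isna), isna, NA)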
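The two lines above hard-code the replacement values as 100 times the column index. If the values come from an arbitrary vector, as with the "b" in the original question, something like the following sketch might do. It assumes that b has exactly one entry per column of DF; the data frame is rebuilt here from the question so the example stands alone, and the name repl is only illustrative.

```r
# Example data taken from the original question
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))
b <- c(100, 200, 300)   # assumed: one replacement value per column

isna <- is.na(DF)       # logical matrix: TRUE where DF is NA

# Fill a matrix row by row so that every row equals b
repl <- matrix(b, nrow = nrow(DF), ncol = ncol(DF), byrow = TRUE)
repl[isna] <- NA        # keep the original NA positions

DF[] <- repl            # DF[] keeps the data frame class and column names
DF
```

Building the matrix with byrow = TRUE avoids relying on b being a multiple of the column index, at the cost of one extra intermediate matrix.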
On 1/24/07, Marc Schwartz <marc_schwartz at comcast.net> wrote:
> On Wed, 2007-01-24 at 14:16 -0600, Marc Schwartz wrote:
> > On Wed, 2007-01-24 at 14:10 -0600, Marc Schwartz wrote:
> > > On Wed, 2007-01-24 at 20:27 +0100, Indermaur Lukas wrote:
> > > > hi
> > > > i have a dataframe "a" which looks like:
> > > >
> > > > column1, column2, column3
> > > >      10,      12,       0
> > > >      NA,       0,       1
> > > >      12,      NA,      50
> > > >
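> > > > i want to replace all values in column1 to column3 that are not "NA" with the matching value of vector "b" (100,200,300), i.e. 100 for column1, 200 for column2 and 300 for column3. the "NA" entries should stay as they are.
> > > >
> > > > any idea how i can do it?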
> > > >
> > > > i appreciate any hint
> > > > regards
> > > > lukas
> > > >
> > >
> > > Here is one possibility:
> > >
> > > > sapply(seq(along = colnames(DF)),
> > >          function(x) ifelse(is.na(DF[[x]]), 100 * x, DF[[x]]))
> > >      [,1] [,2] [,3]
> > > [1,]   10   12    0
> > > [2,]  100    0    1
> > > [3,]   12  200   50
> > >
> > >
> > > Note that the returned object will be a matrix, so if you need a data
> > > frame, just coerce the result with as.data.frame().
> >
> > OK....that's what I get for pulling the trigger too fast.
> >
> > Just reverse the logic in the function:
> >
> > > sapply(seq(along = colnames(DF)),
> >          function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
> >      [,1] [,2] [,3]
> > [1,]  100  200  300
> > [2,]   NA  200  300
> > [3,]  100   NA  300
> >
> >
> > I misread the query initially.
>
> Here is another possibility, which may be faster depending upon the
> actual size and dims of your initial data frame.
>
> Preallocate a matrix of replacement values:
>
> Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)),
>               ncol = ncol(DF))
>
> > Mat
>      [,1] [,2] [,3]
> [1,]  100  200  300
> [2,]  100  200  300
> [3,]  100  200  300
>
>
> Now do the replacement:
>
> > ifelse(!is.na(DF), Mat, NA)
>   column1 column2 column3
> 1     100     200     300
> 2      NA     200     300
> 3     100      NA     300
>
>
> In doing some testing, the above may be about 10 times faster than using
> sapply() in my first solution, again depending upon the structure of
> your DF.
>
> HTH,
>
> Marc
>
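Before comparing speed it may be worth checking that the three approaches in this thread agree. The following is only a rough sketch: the wrapper names f_sapply, f_ifelse and f_replace and the random test frame big are made up for illustration, and the timings will differ by machine and by the shape of the data.

```r
# Example data from the original question
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# Approach 1: column by column with sapply()
f_sapply <- function(d)
  sapply(seq_along(d), function(x) ifelse(!is.na(d[[x]]), 100 * x, d[[x]]))

# Approach 2: preallocated matrix of replacement values plus ifelse()
f_ifelse <- function(d) {
  Mat <- matrix(rep(seq_along(d) * 100, each = nrow(d)), ncol = ncol(d))
  ifelse(!is.na(d), Mat, NA)
}

# Approach 3: replace() applied to 100 * col()
f_replace <- function(d) {
  isna <- is.na(d)
  replace(100 * col(isna), isna, NA)
}

res_sapply  <- f_sapply(DF)
res_ifelse  <- f_ifelse(DF)
res_replace <- f_replace(DF)

# Compare values only; the dimnames differ between the approaches
same <- isTRUE(all.equal(unname(res_sapply), unname(res_ifelse))) &&
        isTRUE(all.equal(unname(res_ifelse), unname(res_replace)))
same

# Rough timing on a larger, made-up data frame (results vary by machine)
big <- as.data.frame(matrix(rnorm(1e5), ncol = 10))
t_sapply  <- system.time(for (i in 1:20) f_sapply(big))["elapsed"]
t_ifelse  <- system.time(for (i in 1:20) f_ifelse(big))["elapsed"]
t_replace <- system.time(for (i in 1:20) f_replace(big))["elapsed"]
c(sapply = t_sapply, ifelse = t_ifelse, replace = t_replace)
```

All three return a matrix rather than a data frame, so as.data.frame() or an assignment into DF[] is still needed if a data frame is wanted.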