[R] Replace NAs in split lists
Ek Esawi
esawiek at gmail.com
Mon Jan 8 17:03:45 CET 2018
Thank you, Jeff. Your code works perfectly, as usual. I am just
wondering why I get an error message if I put the whole code on one
line.
sdf2 <- lapply( sdf, function(z){z$Value
<-ifelse(is.na(z$Value),z$Value[!is.na(z$Value)][1],z$Value)z})
error. unexpected symbol in sdf2
Thanks again
EK
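
A likely explanation, offered as a sketch rather than a definitive diagnosis: when two statements share one line inside braces, R needs a semicolon between them. In the one-line version above, the closing parenthesis of ifelse() is followed directly by z, and the parser reports that as an unexpected symbol. The example below rebuilds df1 from the quoted message and adds the separator; nothing else is changed.

```r
# Rebuild the example data from the quoted message
df1 <- read.table(text = "ID ID_2 Firist Value
1 a aa TRUE 2
2 a ab FALSE NA
3 a ac FALSE NA
4 b aa TRUE 5
5 b ab FALSE NA
", header = TRUE, as.is = TRUE)

sdf <- split(df1, df1$ID)

# Same body as the multi-line version; ';' separates the assignment from the returned z
sdf2 <- lapply(sdf, function(z) { z$Value <- ifelse(is.na(z$Value), z$Value[!is.na(z$Value)][1], z$Value); z })

df2 <- do.call(rbind, sdf2)
df2
```

A newline acts as a statement separator in the multi-line version, which is why that form parses without the semicolon.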
On Mon, Jan 8, 2018 at 3:12 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> Upon closer examination I see that you are not using the split version of
> df1 as I usually would, so here is a reproducible example:
>
> #----
> df1 <- read.table( text=
> "ID ID_2 Firist Value
> 1 a aa TRUE 2
> 2 a ab FALSE NA
> 3 a ac FALSE NA
> 4 b aa TRUE 5
> 5 b ab FALSE NA
> ", header=TRUE, as.is=TRUE )
>
> sdf <- split( df1, df1$ID )
> # note the extra [ 1 ] in case you have more than one non-NA value per ID
> sdf2 <- lapply( sdf
> , function( z ) {
> z$Value <- ifelse( is.na( z$Value )
> , z$Value[ !is.na( z$Value ) ][ 1 ]
> , z$Value
> )
> z
> }
> )
> df2 <- do.call( rbind, sdf2 )
> df2
> #> ID ID_2 Firist Value
> #> a.1 a aa TRUE 2
> #> a.2 a ab FALSE 2
> #> a.3 a ac FALSE 2
> #> b.4 b aa TRUE 5
> #> b.5 b ab FALSE 5
>
> # or using tidyverse methods
>
> library(dplyr)
> #>
> #> Attaching package: 'dplyr'
> #> The following objects are masked from 'package:stats':
> #>
> #> filter, lag
> #> The following objects are masked from 'package:base':
> #>
> #> intersect, setdiff, setequal, union
> df3 <- ( df1
> %>% group_by( ID )
> %>% do({
> mutate( .
> , Value = ifelse( is.na( Value )
> , Value[ !is.na( Value ) ][ 1 ]
> , Value
> )
> )
> })
> %>% ungroup
> )
> df3
> #> # A tibble: 5 x 4
> #> ID ID_2 Firist Value
> #> <chr> <chr> <lgl> <int>
> #> 1 a aa T 2
> #> 2 a ab F 2
> #> 3 a ac F 2
> #> 4 b aa T 5
> #> 5 b ab F 5
> #----
>
>
> On Sun, 7 Jan 2018, Jeff Newmiller wrote:
>
>> Why do you want to modify df1?
>>
>> Why not just reassemble the parts as a new data frame and use that going
>> forward in your calculations? That is generally the preferred approach in R
>> so you can re-do your calculations easily if you find a mistake later.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On January 7, 2018 7:35:59 PM PST, Ek Esawi <esawiek at gmail.com> wrote:
>>>
>>> I just came up with a solution right after I posted the question, but
>>> I figured there must be a better and shorter one than my solution:
>>> sdf1[[1]][1,4]<-lapplyresults[[1]]
>>> sdf1[[2]][1,4]<-lapplyresults[[2]]
>>>
>>> EK
>>>
>>> On Sun, Jan 7, 2018 at 10:13 PM, Ek Esawi <esawiek at gmail.com> wrote:
>>>>
>>>> Hi all--
>>>>
>>>> I stumbled on this problem online. I did not like the solution given
>>>> there, which was a long user-defined function. I wondered why split and
>>>> lapply/sapply could not work here. My aim is to split the data frame, use
>>>> lapply/sapply to make changes on the split lists, and combine the split
>>>> lists into a new data frame with the desired changes.
>>>>
>>>> The data frame shown below has a column named ID which has 2 values,
>>>> a and b; I want to replace the NAs in the Value column by 2, which is
>>>> the only numeric entry, for ID=a and by 5 for ID=b.
>>>>
>>>> I worked out the solution but could not replace the results in the
>>>> split lists.
>>>>
>>>>
>>>> Original data frame, df1
>>>> ID ID_2 Firist Value
>>>> 1 a aa TRUE 2
>>>> 2 a ab FALSE NA
>>>> 3 a ac FALSE NA
>>>> 4 b aa TRUE 5
>>>> 5 b ab FALSE NA
>>>> Sdf1
>>>> $a
>>>> ID ID_2 Firist Value
>>>> 1 a aa TRUE 2
>>>> 2 a ab FALSE NA
>>>> 3 a ac FALSE NA
>>>> $b
>>>> ID ID_2 Firist Value
>>>> 4 b aa TRUE 5
>>>> 5 b ab FALSE NA
>>>> Desired results
>>>> $a
>>>> ID ID_2 Firist Value
>>>> 1 a aa TRUE 2
>>>> 2 a ab FALSE 2
>>>> 3 a ac FALSE 2
>>>>
>>>> $b
>>>> ID ID_2 Firist Value
>>>> 4 b aa TRUE 5
>>>> 5 b ab FALSE 5
>>>>
>>>> My code
>>>>
>>>> sdf <- split(df1,df1$ID)
>>>> lapply(sdf, function(z) ifelse(is.na(z$Value),z$Value[!is.na(z$Value)],z$Value))
>>>>
>>>> result:
>>>> $ a: num [1:3] 2 2 2
>>>> $ b: num [1:2] 5 5
>>>>
>>>> How could I put these two lists back in the split data frame, sdf1?
>>>> Then I could use do.call to reassemble a data frame from the split
>>>> lists.
>>>>
>>>> Thanks,
>>>> EK
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ---------------------------------------------------------------------------
> Jeff Newmiller The ..... ..... Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
> Live: OO#.. Dead: OO#.. Playing
> Research Engineer (Solar/Batteries O.O#. #.O#. with
> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
> ---------------------------------------------------------------------------
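
As a compact alternative to the split/lapply/rbind round trip discussed in this thread, base R's ave() applies a function within groups and returns a vector in the original row order. This is a swapped-in technique, not the method from the messages above; the helper name fill_first is hypothetical, and the sketch assumes each ID should be filled with its first non-NA Value.

```r
# Rebuild the example data from the thread
df1 <- read.table(text = "ID ID_2 Firist Value
1 a aa TRUE 2
2 a ab FALSE NA
3 a ac FALSE NA
4 b aa TRUE 5
5 b ab FALSE NA
", header = TRUE, as.is = TRUE)

# Hypothetical helper: replace NAs in v with the first non-NA element of v
# (a group that is entirely NA stays NA)
fill_first <- function(v) ifelse(is.na(v), v[!is.na(v)][1], v)

# ave() runs fill_first on Value within each ID and keeps the original row order;
# transform() returns a new data frame and leaves df1 untouched
df2 <- transform(df1, Value = ave(Value, ID, FUN = fill_first))
df2
```

Because the result is a new data frame, df1 stays available for re-running the calculation later, in line with the advice quoted above.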