[R] merging dataframes in a list

Ed Siefker ebs15242 at gmail.com
Fri Jun 3 22:02:13 CEST 2016


Thanks, ldply got me a data frame straight away.  But it filled the
missing cells with NA, and merge no longer works.

> ldply(mylist)
     name red green
1 sample1  20    NA
2 sample1  NA    15
3 sample2  10    NA
4 sample2  NA    30
> mydf <- ldply(mylist)
> merge(mydf[1,],mydf[2,])
[1] name  red   green
<0 rows> (or 0-length row.names)
> merge(mydf[1,],mydf[2,], by=1)
     name red.x green.x red.y green.y
1 sample1    20      NA    NA      15


How do I merge data frames that contain NA values?
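
For what it is worth, the first merge call above probably returns zero rows
because merge() joins on every shared column by default (name, red and green),
and 20 never equals NA. What follows is only a minimal base R sketch, not a
definitive answer. It assumes the ldply output looks exactly like the mydf
printed above, and the helper name first_non_na is made up for illustration.
It shows two possible routes: collapsing the NA-padded rows per name with
aggregate, or skipping ldply and merging the original list group by group.

```r
# Rebuild the NA-padded data frame shown above (assumed to match ldply(mylist)).
mydf <- data.frame(
  name  = c("sample1", "sample1", "sample2", "sample2"),
  red   = c(20, NA, 10, NA),
  green = c(NA, 15, NA, 30),
  stringsAsFactors = FALSE
)

# merge() with no 'by' joins on all common columns, so these rows never match.
nrow(merge(mydf[1, ], mydf[2, ]))

# Route 1: collapse rows per name, keeping the first non-NA value per column.
# first_non_na is a hypothetical helper, not part of base R or plyr.
first_non_na <- function(v) v[!is.na(v)][1]
result <- aggregate(mydf[c("red", "green")], by = mydf["name"],
                    FUN = first_non_na)

# Route 2: work from the original list, grouping the pieces by sample name
# and merging within each group before stacking the groups.
mylist <- list(
  data.frame(name = "sample1", red = 20, stringsAsFactors = FALSE),
  data.frame(name = "sample1", green = 15, stringsAsFactors = FALSE),
  data.frame(name = "sample2", red = 10, stringsAsFactors = FALSE),
  data.frame(name = "sample2", green = 30, stringsAsFactors = FALSE)
)
keys    <- vapply(mylist, function(d) as.character(d$name[1]), character(1))
groups  <- split(mylist, keys)
merged  <- lapply(groups, function(g) Reduce(merge, g))
result2 <- do.call(rbind, merged)
```

Both result and result2 should hold one row per sample with red and green
filled in. Route 1 silently keeps only the first value if a sample has more
than one non-NA entry in a column, so it is only suitable when that cannot
happen.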

On Fri, Jun 3, 2016 at 2:17 PM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
> You can use ldply in the plyr package to bind all the data.frames together
> (a regular loop will also work). Afterwards you can summarise using ddply.
>
> Hope this helps
> Ulrik
>
>
> Ed Siefker <ebs15242 at gmail.com> wrote on Fri, 3 June 2016 21:10:
>>
>> aggregate isn't really what I want.  Maybe tapply?  I still can't get
>> it to work.
>>
>> > length(mylist)
>> [1] 4
>> > length(names)
>> [1] 4
>> > tapply(mylist, names, merge)
>> Error in tapply(mylist, names, merge) : arguments must have same length
>>
>> I guess because a list isn't an atomic data type.  What function will
>> do the same on lists?  lapply doesn't have a 'by' argument.
>>
>> On Fri, Jun 3, 2016 at 1:41 PM, Ed Siefker <ebs15242 at gmail.com> wrote:
>> > I manually constructed the list of sample names and tried the
>> > aggregate call I mentioned.
>> > Merge works when called manually, but not when using aggregate.
>> >
>> >> mylist <- list(data.frame(name="sample1", red=20),
>> >> data.frame(name="sample1", green=15), data.frame(name="sample2", red=10),
>> >> data.frame(name="sample2", green=30))
>> >>  names <- list("sample1", "sample1", "sample2", "sample2")
>> >> merge(mylist[1], mylist[2])
>> >      name red green
>> > 1 sample1  20    15
>> >> merge(mylist[3], mylist[4])
>> >      name red green
>> > 1 sample2  10    30
>> >> aggregate(mylist, by=as.list(names), merge)
>> > Error in as.data.frame(y) : argument "y" is missing, with no default
>> >
>> > What's the right way to do this?
>> >
>> > On Fri, Jun 3, 2016 at 1:20 PM, Ed Siefker <ebs15242 at gmail.com> wrote:
>> >> I have a list of data as follows.
>> >>
>> >>> list(data.frame(name="sample1", red=20), data.frame(name="sample1",
>> >>> green=15), data.frame(name="sample2", red=10), data.frame(name="sample2",
>> >>> green=30))
>> >> [[1]]
>> >>      name red
>> >> 1 sample1  20
>> >>
>> >> [[2]]
>> >>      name green
>> >> 1 sample1    15
>> >>
>> >> [[3]]
>> >>      name red
>> >> 1 sample2  10
>> >>
>> >> [[4]]
>> >>      name green
>> >> 1 sample2    30
>> >>
>> >>
>> >> I would like to massage this into a data frame like this:
>> >>
>> >>      name red green
>> >> 1 sample1  20    15
>> >> 2 sample2  10    30
>> >>
>> >>
>> >> I'm imagining I can use aggregate(mylist, by=samplenames, merge)
>> >> right?  But how do I get the list of samplenames?  How do I subset
>> >> each dataframe inside the list?
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.


