[R] group by and merge two dataframes

Rui Barradas ruipbarradas at sapo.pt
Thu May 8 13:21:25 CEST 2014


Hello,

There are some alternatives without using sqldf or another package.

1.

tmp2 <- aggregate(item ~ id, data = df1, FUN = unique)

Then merge() like you've done.

2.

tmp3 <- merge(df1, df2)
tmp3[!duplicated(tmp3), ]


Hope this helps,

Rui Barradas

Em 08-05-2014 10:44, Massimo Bressan escreveu:
> given this "bare bone" example:
>
> df1 <- data.frame(id=rep(1:3,each=2), item=c(rep("A",2), rep("B",2),
> rep("C",2)))
> df2 <- data.frame(id=c(1,2,3), who=c("tizio","caio","sempronio"))
>
> I need to group the first dataframe "df1" by "id" and then merge with
> the second dataframe "df2" (again by "id")
> so far I've manged to accomplish the task by something like the
> following...
>
> # start
>
> require(sqldf)
> tmp<-sqldf("select * from df1 group by id")
> merge(tmp, df2)
>
> #end
>
> now I'm wonderng if there is a more efficient and/or elegant way to
> perform it (also because in fact I'm dealing with much more "heavy"
> dataframes);
>
> may be possible through a single sql statement?  or by using a different
> package functions (e.g. dplyr)?
> my attempts towards these alternative approaches miserably failed ...
>
> thanks
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list