[R] data frames; matching/merging
Gabor Grothendieck
ggrothendieck at gmail.com
Mon Feb 8 18:11:16 CET 2010
Here are 4 solutions assuming DF contains the data frame:
> # 1. aggregate
> aggregate(DF[2], DF[1], min)
  V1 V2
1  a  2
2  b  9
3  c  4

> # 2. aggregate.formula - requires R 2.11.x
> aggregate(V2 ~ V1, DF, min)
  V1 V2
1  a  2
2  b  9
3  c  4

> # 3. SQL using sqldf
> library(sqldf)
> sqldf("select V1, min(V2) V2 from DF group by V1")
  V1 V2
1  a  2
2  b  9
3  c  4

> # 4. summaryBy in the doBy package
> library(doBy)
> summaryBy(V2 ~ ., DF, FUN = min, keep.names = TRUE)
  V1 V2
1  a  2
2  b  9
3  c  4
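
For anyone wanting to run the solutions, DF can be rebuilt from the example in the original post. This is only a minimal sketch: the column types (character V1, numeric V2) are an assumption, since the post shows just the printed data frame.

```r
# DF reconstructed from the printed example in the original post.
# Assumption: V1 is character and V2 is numeric; the post does not say.
DF <- data.frame(V1 = c("a", "a", "b", "c", "a", "b"),
                 V2 = c(3, 2, 9, 4, 7, 11),
                 stringsAsFactors = FALSE)

# Solution 1: take the minimum of V2 within each group of V1.
res <- aggregate(DF[2], DF[1], min)
print(res)
```

Solutions 2 to 4 can be run against the same DF once the respective packages are installed.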
On Mon, Feb 8, 2010 at 11:39 AM, Jonathan <jonsleepy at gmail.com> wrote:
> Hi all,
> I'm feeling a little guilty to ask this question, since I've
> written a solution using a rather clunky for loop that gets the job
> done. But I'm convinced there must be a faster (and probably more
> elegant) way to accomplish what I'm looking to do (perhaps using the
> "merge" function?). I figured somebody out there might've already
> figured this out:
>
> I have a dataframe with two columns (let's call them V1 and V2). All
> rows are unique, although column V1 has several redundant entries.
>
> Ex:
>
>   V1 V2
> 1  a  3
> 2  a  2
> 3  b  9
> 4  c  4
> 5  a  7
> 6  b 11
>
>
> What I'd like is to return a dataframe cut down to have only unique
> entries in V1. V2 should contain, for each V1, the minimum of all
> the V2 values from the set of rows sharing that V1.
>
> Example output:
>
>   V1 V2
> 1  a  2
> 2  b  9
> 3  c  4
>
>
> If somebody could (relatively easily) figure out how to get closer to
> a solution, I'd appreciate hearing how. Also, I'd be interested to
> hear how you came upon the answer (so I can get better at searching
> the R resources myself).
>
> Regards,
> Jonathan