[R] Creating binary variable depending on strings of two dataframes
David Winsemius
dwinsemius at comcast.net
Tue Dec 7 18:30:56 CET 2010
On Dec 7, 2010, at 11:30 AM, Pete Pete wrote:
>
> Hi,
> consider the following two dataframes:
> x1=c("232","3454","3455","342","13")
> x2=c("1","1","1","0","0")
> data1=data.frame(x1,x2)
>
> y1=c("232","232","3454","3454","3455","342","13","13","13","13")
> y2=c("E1","F3","F5","E1","E2","H4","F8","G3","E1","H2")
> data2=data.frame(y1,y2)
>
> I need a new column in dataframe data1 (x3), which is either 0 or 1
> depending on whether the value "E1" occurs in y2 of data2 for a row
> where y1 equals x1. The result of data1 should look like this:
>     x1 x2 x3
> 1  232  1  1
> 2 3454  1  1
> 3 3455  1  0
> 4  342  0  0
> 5   13  0  1
>
> I think a SQL command could help me but I am too inexperienced with it
> to get there.

dat3 <- merge(data1, data2[data2$y2=="E1", ], by.x="x1", by.y="y1",
              all.x=TRUE)
dat3$y2 <- 0 + (dat3$y2 %in% "E1")
dat3
    x1 x2 y2
1   13  0  1
2  232  1  1
3  342  0  0
4 3454  1  1
5 3455  1  0
(Admittedly the indicator ends up in a column named y2 rather than x3,
and the rows are not in the original order: merge() sorts the result on
the key column by default, and in my hands it doesn't lend itself well
to maintaining the original order. I see that Grothendieck's solution
is better with respect to row order, a typical occurrence in comparison
of our respective efforts with R.)
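
For readers who need the rows of data1 left in their original order, here is a minimal sketch that avoids merge() altogether and uses %in% on the key column. It is an illustration only and not necessarily the approach posted elsewhere in the thread. The helper name e1_keys and the use of stringsAsFactors = FALSE are assumptions made here, not part of the original post.

```r
# Data from the original question; stringsAsFactors = FALSE is an
# assumption made here so that the keys stay character vectors.
data1 <- data.frame(x1 = c("232", "3454", "3455", "342", "13"),
                    x2 = c("1", "1", "1", "0", "0"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(y1 = c("232", "232", "3454", "3454", "3455",
                           "342", "13", "13", "13", "13"),
                    y2 = c("E1", "F3", "F5", "E1", "E2",
                           "H4", "F8", "G3", "E1", "H2"),
                    stringsAsFactors = FALSE)

# Keys in data2 that have at least one "E1" row (hypothetical helper name)
e1_keys <- data2$y1[data2$y2 == "E1"]

# 1 if the data1 key is among those keys, 0 otherwise;
# the rows of data1 are never reordered
data1$x3 <- as.integer(data1$x1 %in% e1_keys)
data1
```

Because nothing is merged, data1 keeps its row order and the new column is named x3 directly, as requested in the question.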
--
David Winsemius, MD
West Hartford, CT