[R] adding an additional column for preserving uniqueness
William Dunlap
wdunlap at tibco.com
Thu Jan 29 02:38:37 CET 2015
> with(dat1, ave(integer(length(Date)), Date, FUN=seq_along))
[1] 1 1 2 1 1 2 1 2 1 2 1
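
A minimal sketch of how that expression might be applied to the whole task, assuming the k-th row for a given Date in dat1 should pair with the k-th row for that Date in dat2. The helper name add_item is hypothetical (not a base R function); the data are the excerpts from the original post, read with read.table(text=) so no connection needs closing.

```r
# Example data from the original post.
dat1 <- read.table(text = "Date ConcAve
2009-07-08 7
2009-08-26 1
2009-08-26 2
2009-09-15 2
2009-10-14 2
2009-10-14 2
2009-10-16 101
2009-10-16 93
2009-11-18 4
2009-11-18 3
2010-01-04 4", header = TRUE)

dat2 <- read.table(text = "Date ConcAve
2009-08-26 4.84e-05
2009-09-15 4.58e-05
2009-10-14 3.86e-05
2009-10-14 3.55e-05
2009-10-16 3.07e-05
2009-10-16 2.35e-05
2009-11-18 2.00e-05
2009-11-18 1.96e-05
2010-01-04 1.52e-05
2010-01-04 1.53e-05
2010-02-10 2.23e-05", header = TRUE)

# Hypothetical helper: number the rows within each Date (1, 2, ...)
# in their current row order and store the counter in a new column.
add_item <- function(d) {
  d$item <- ave(integer(nrow(d)), d$Date, FUN = seq_along)
  d
}

dat1 <- add_item(dat1)
dat2 <- add_item(dat2)

# Merging on both keys pairs each row with at most one partner.
merged <- merge(dat1, dat2, by = c("Date", "item"))
merged
```

The counter follows the existing row order within each Date, so if that order is not meaningful the data frames would need sorting first.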
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Wed, Jan 28, 2015 at 4:54 PM, Morway, Eric <emorway at usgs.gov> wrote:
> The two datasets below are excerpts from much larger datasets. Note that
> there are duplicate dates in both dat1 and dat2, e.g., "2009-10-14".
>
> dat1 <- read.table(textConnection("Date ConcAve
> 2009-07-08 7
> 2009-08-26 1
> 2009-08-26 2
> 2009-09-15 2
> 2009-10-14 2
> 2009-10-14 2
> 2009-10-16 101
> 2009-10-16 93
> 2009-11-18 4
> 2009-11-18 3
> 2010-01-04 4"),header=TRUE)
> closeAllConnections()
>
> dat2 <- read.table(textConnection("Date ConcAve
> 2009-08-26 4.84e-05
> 2009-09-15 4.58e-05
> 2009-10-14 3.86e-05
> 2009-10-14 3.55e-05
> 2009-10-16 3.07e-05
> 2009-10-16 2.35e-05
> 2009-11-18 2.00e-05
> 2009-11-18 1.96e-05
> 2010-01-04 1.52e-05
> 2010-01-04 1.53e-05
> 2010-02-10 2.23e-05"),header=TRUE)
> closeAllConnections()
>
> I'm seeking an R operation that will append a third column to both
> data frames so that the duplicated dates become unique keys when I run merge().
> The desired result for dat1 would be:
>
> Date ConcAve item
> 2009-07-08 7 1
> 2009-08-26 1 1
> 2009-08-26 2 2
> 2009-09-15 2 1
> 2009-10-14 2 1
> 2009-10-14 2 2
> 2009-10-16 101 1
> 2009-10-16 93 2
> 2009-11-18 4 1
> 2009-11-18 3 2
> 2010-01-04 4 1
>
> This way, I avoid the duplicated matches shown here:
>
> merge(dat1, dat2, by="Date")
> # Date ConcAve.x ConcAve.y
> #1 2009-08-26 1 4.84e-05
> #2 2009-08-26 2 4.84e-05
> #3 2009-09-15 2 4.58e-05
> #4 2009-10-14 2 3.55e-05
> #5 2009-10-14 2 3.86e-05
> #6 2009-10-14 2 3.55e-05
> #7 2009-10-14 2 3.86e-05
> #8 2009-10-16 101 3.07e-05
> #9 2009-10-16 101 2.35e-05
> #10 2009-10-16 93 3.07e-05
> #11 2009-10-16 93 2.35e-05
> #12 2009-11-18 4 1.96e-05
> #13 2009-11-18 4 2.00e-05
> #14 2009-11-18 3 1.96e-05
> #15 2009-11-18 3 2.00e-05
> #16 2010-01-04 4 1.52e-05
> #17 2010-01-04 4 1.53e-05
>
> With the new column, which I've inserted manually in this small example, I
> instead get the merge result below, which is what I'm after for the larger
> problem:
>
> dat3 <- read.table(textConnection("Date ConcAve item
> 2009-07-08 7 1
> 2009-08-26 1 1
> 2009-08-26 2 2
> 2009-09-15 2 1
> 2009-10-14 2 1
> 2009-10-14 2 2
> 2009-10-16 101 1
> 2009-10-16 93 2
> 2009-11-18 4 1
> 2009-11-18 3 2
> 2010-01-04 4 1"),header=TRUE)
> closeAllConnections()
>
> dat4 <- read.table(textConnection("Date ConcAve item
> 2009-08-26 4.84e-05 1
> 2009-09-15 4.58e-05 1
> 2009-10-14 3.86e-05 1
> 2009-10-14 3.55e-05 2
> 2009-10-16 3.07e-05 1
> 2009-10-16 2.35e-05 2
> 2009-11-18 2.00e-05 1
> 2009-11-18 1.96e-05 2
> 2010-01-04 1.52e-05 1
> 2010-01-04 1.53e-05 2
> 2010-02-10 2.23e-05 1"),header=TRUE)
> closeAllConnections()
>
> merge(dat3, dat4, by=c("Date","item"))
> # Date item ConcAve.x ConcAve.y
> #1 2009-08-26 1 1 4.84e-05
> #2 2009-09-15 1 2 4.58e-05
> #3 2009-10-14 1 2 3.86e-05
> #4 2009-10-14 2 2 3.55e-05
> #5 2009-10-16 1 101 3.07e-05
> #6 2009-10-16 2 93 2.35e-05
> #7 2009-11-18 1 4 2.00e-05
> #8 2009-11-18 2 3 1.96e-05
> #9 2010-01-04 1 4 1.52e-05