[R] problem on merge data
Ista Zahn
istazahn at gmail.com
Fri Nov 6 13:19:19 CET 2009
Hi,
So you want to randomly throw away data? Doesn't sound like a good idea to me...
You can get the combined data set using
data3 <- merge(data2, data1, all=TRUE)
From there it's just a matter of randomly deleting rows in which the
combination of areaid, x1 and y1 is duplicated. I'll leave that to
you, but I encourage you to think about whether this is really what
you want.
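
For what it's worth, here is a minimal sketch of one way the merge and the
random deletion could go. It is not the only approach and it makes some
assumptions: the data frames are rebuilt with explicit column types
(the matrix() call in the original post stores every value as character),
the merge key is given explicitly as areaid, and the seed and the names
shuffled, keep and result are arbitrary choices for illustration.

```r
# Rebuild the poster's data with explicit column types (assumption: numeric
# coordinates and character dates, rather than the all-character matrix).
data1 <- data.frame(
  areaid = c(1, 1, 2, 3),
  x      = c(1.2, 1.5, 0.2, 1.5),
  y      = c(1.3, 2.3, 3.3, 1.3),
  date   = c("3/23/2004", "3/22/2004", "4/23/2004", "5/22/2004"),
  stringsAsFactors = FALSE
)
data2 <- data.frame(
  areaid = c(1, 1, 1, 1, 2, 2, 3, 3),
  x1     = c(1.22, 1.53, 1.21, 1.52, 0.21, 0.23, 1.57, 1.59),
  y1     = c(1.32, 2.34, 1.37, 2.35, 3.33, 3.35, 1.31, 1.33)
)

# Pair every control row with every case row that shares its areaid.
# areaid 1 gives 4 x 2 rows, areaid 2 and 3 give 2 x 1 rows each: 12 in total.
data3 <- merge(data2, data1, by = "areaid", all = TRUE)

# Shuffle the rows, then keep only the first occurrence of each control
# (areaid, x1, y1). Because the order is random, each control ends up
# linked to a randomly chosen case from its own areaid.
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
shuffled <- data3[sample(nrow(data3)), ]
keep     <- !duplicated(shuffled[, c("areaid", "x1", "y1")])
result   <- shuffled[keep, ]

# Restore a readable order: one row per control, 8 rows in total.
result <- result[order(result$areaid, result$x1), ]
rownames(result) <- NULL
print(result)
```

Note that this keeps each control exactly once but does not force exactly
two controls per case: within areaid 1, chance alone may give one case
three controls and the other one. If the 1:2 balance matters, the
assignment would have to be done within each areaid instead.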
-Ista
On Thu, Nov 5, 2009 at 11:34 PM, rusers.sh <rusers.sh at gmail.com> wrote:
> Hi there,
> data1<-matrix(data=c(1,1.2,1.3,"3/23/2004",1,1.5,2.3,"3/22/2004",2,0.2,3.3,"4/23/2004",3,1.5,1.3,"5/22/2004"),nrow=4,ncol=4,byrow=TRUE)
> data1<-data.frame(data1)
> names(data1)<-c("areaid","x","y","date")
> data1
>
> areaid x y date
> 1 1 1.2 1.3 3/23/2004
> 2 1 1.5 2.3 3/22/2004
> 3 2 0.2 3.3 4/23/2004
> 4 3 1.5 1.3 5/22/2004
> data2<-matrix(data=c(1,1.22,1.32,1, 1.53, 2.34,1, 1.21, 1.37,1, 1.52,
> 2.35,2, 0.21, 3.33,2, 0.23, 3.35,3, 1.57, 1.31,3, 1.59,
> 1.33),nrow=8,ncol=3,byrow=TRUE)
> data2<-data.frame(data2)
> names(data2)<-c("areaid","x1","y1")
> data2
>
> areaid x1 y1
> 1 1 1.22 1.32
> 2 1 1.53 2.34
> 3 1 1.21 1.37
> 4 1 1.52 2.35
> 5 2 0.21 3.33
> 6 2 0.23 3.35
> 7 3 1.57 1.31
> 8 3 1.59 1.33
> An explanation of the two datasets: you can treat data1 as the case dataset
> and data2 as the control dataset. Note that for each areaid, data2 has twice
> as many records as data1, something like a 1:2 matched case-control study
> design. I hope to merge data1 and data2. Take areaid=1 as an example.
> From the two datasets, we can see that data1 has two points (x,y) in
> areaid=1, and data2 has four points (x1,y1) in areaid=1. Each record in
> data1 will have two matched records in data2. I want to randomly select
> half of the areaid=1 points in data2 to link to one areaid=1 record in
> data1, and the other half to link to the other areaid=1 record in data1.
> In the actual dataset, the number of records within the same areaid will be
> more than 2; this is only an example to explain the problem.
> The cases of areaid=2 and areaid=3 are a little easier than areaid=1
> because there is only one record in data1 for each.
> The final results are something like the following dataset.
> areaid x1 y1 date x y
> 1 1.22 1.32 3/23/2004 1.2 1.3
> 1 1.53 2.34 3/23/2004 1.2 1.3
> 1 1.21 1.37 3/22/2004 1.5 2.3
> 1 1.52 2.35 3/22/2004 1.5 2.3
> 2 0.21 3.33 4/23/2004 0.2 3.3
> 2 0.23 3.35 4/23/2004 0.2 3.3
> 3 1.57 1.31 5/22/2004 1.5 1.3
> 3 1.59 1.33 5/22/2004 1.5 1.3
>
> Any suggestions or help are greatly appreciated.
> Thanks a lot.
>
--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org