[R] Using indexing to manipulate data
William Dunlap
wdunlap at tibco.com
Thu Mar 18 17:57:47 CET 2010
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Lemon
> Sent: Thursday, March 18, 2010 1:33 AM
> To: duncandonutz
> Cc: r-help at r-project.org
> Subject: Re: [R] Using indexing to manipulate data
>
> On 03/18/2010 04:05 PM, duncandonutz wrote:
> >
> > I know one of R's advantages is its ability to index, eliminating the
> > need for control loops to select relevant data, so I thought this
> > problem would be easy. I can't crack it. I have looked through past
> > postings, but nothing seems to match this problem.
> >
> > I have a data set with one column of actors and one column of acts.
> > I need a list that will give me a pair of actors in each row,
> > provided they both participated in the act.
> >
> > Example:
> >
> > The Data looks like this:
> > Jim A
> > Bob A
> > Bob C
> > Larry D
> > Alice C
> > Tom F
> > Tom D
> > Tom A
> > Alice B
> > Nancy B
> >
> > I would like this:
> > Jim Bob
> > Jim Tom
> > Bob Alice
> > Larry Tom
> > Alice Nancy
> >
> > The order doesn't matter (Jim-Bob vs. Bob-Jim), but each pairing
> > should be counted only once.
You can use merge() to get all possible within-group pairings and then
eliminate the self-pairings and the same-but-for-order pairings with the
following code:
> data <- read.table(header=FALSE, textConnection("
+ Jim A
+ Bob A
+ Bob C
+ Larry D
+ Alice C
+ Tom F
+ Tom D
+ Tom A
+ Alice B
+ Nancy B
+ ")) # column names are now V1 and V2
> # add sequence numbers for elimination step
> data$seq <- seq_len(nrow(data))
> tmp <- merge(data,data,by="V2")
> result <- tmp[tmp$seq.x < tmp$seq.y,] # omit unwanted pairings
> result
   V2  V1.x seq.x  V1.y seq.y
2   A   Jim     1   Bob     2
3   A   Jim     1   Tom     8
6   A   Bob     2   Tom     8
11  B Alice     9 Nancy    10
15  C   Bob     3 Alice     5
20  D Larry     4   Tom     7
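
As a possible follow-up (a sketch, not part of the original reply): the result above keeps the act and sequence columns, and two actors who shared more than one act would appear once per act. Assuming the goal is one row per unordered pair, the names can be put in a fixed order with pmin() and pmax() and the repeats dropped with unique(). The column names actor1 and actor2 are made up for illustration.

```r
# Rebuild the example data from the thread (V1 = actor, V2 = act).
data <- read.table(header = FALSE, stringsAsFactors = FALSE, text = "
Jim A
Bob A
Bob C
Larry D
Alice C
Tom F
Tom D
Tom A
Alice B
Nancy B
")

# Same steps as in the reply: self-merge on the act, then keep each
# within-act pairing once by comparing sequence numbers.
data$seq <- seq_len(nrow(data))
tmp <- merge(data, data, by = "V2")
result <- tmp[tmp$seq.x < tmp$seq.y, ]

# Assumption: only the two names are wanted. Order each pair
# alphabetically so Jim-Bob and Bob-Jim become the same row.
pairs <- data.frame(
  actor1 = pmin(result$V1.x, result$V1.y),
  actor2 = pmax(result$V1.x, result$V1.y),
  stringsAsFactors = FALSE
)

# Drop pairs repeated because the two actors shared several acts.
pairs <- unique(pairs)
pairs
```

With the example data this leaves six rows, including Bob and Tom, who both took part in act A even though that pair is missing from the list in the question.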
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
>
> Hi duncandonutz,
> Try this:
>
> actnames<-read.table("junkfunc/names.dat",stringsAsFactors=FALSE)
> actorpairs<-NULL
> for(act in unique(actnames$V2)) {
>   actors<-actnames$V1[actnames$V2 == act]
>   nactors<-length(actors)
>   if(nactors > 1) {
>     indices<-combn(nactors,2)
>     for(i in 1:dim(indices)[2])
>       actorpairs<-
>         rbind(actorpairs,c(actors[indices[1,i]],actors[indices[2,i]]))
>   }
> }
> actorpairs
>
> Jim
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>