[R] using match to obtain non-sorted index values from non-sorted vector
David Winsemius
dwinsemius at comcast.net
Wed Jul 9 23:01:29 CEST 2014
On Jul 9, 2014, at 1:13 PM, Folkes, Michael wrote:
> So nice!
> Apply wins again.
I doubt that `sapply( ..., which(,) )` would win a foot race with `match`:
> match(Tset, pop.df$pop)
[1] 5 4 2
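
A minimal sketch for checking this, using the data from the original post. The names `idx_match`, `idx_which`, `Tset2`, `big.pop` and `big.set` are made up here for illustration and do not come from the thread; the timings are only a rough comparison and will vary by machine.

```r
# Data from the original post
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
Tset <- c(10, 3, 6)

# match(x, table) gives, for each element of x, the position of its first
# occurrence in table, so the result follows the order of Tset, not of pop.
idx_match <- match(Tset, pop.df$pop)

# The sapply/which version returns the same positions for this data.
idx_which <- sapply(Tset, function(x) which(pop.df$pop == x))
identical(idx_match, idx_which)

# Where the two differ: a value that is absent from pop.df$pop.
Tset2 <- c(10, 7, 6)                  # 7 is hypothetical, not in the post
match(Tset2, pop.df$pop)              # NA marks the missing value
sapply(Tset2, function(x) which(pop.df$pop == x))  # a list, lengths differ

# The index column can be added to the ranking table in one step,
# without the intermediate merge().
rank.df <- data.frame(rank = 1:3, Tset = Tset)
rank.df$index.vec <- match(rank.df$Tset, pop.df$pop)

# Rough timing on larger, made-up vectors (assumed sizes).
big.pop <- sample(1e5)
big.set <- sample(big.pop, 1e3)
system.time(match(big.set, big.pop))
system.time(sapply(big.set, function(x) which(big.pop == x)))
```

Note that `match` reports only the first position when `pop` contains duplicates, whereas `which` reports all of them, so the two are interchangeable only when the population values are unique.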
--
David.
> Thanks David.
> Michael
>
> -----Original Message-----
> From: David L Carlson [mailto:dcarlson at tamu.edu]
> Sent: July-09-14 1:11 PM
> To: Folkes, Michael; r-help at r-project.org
> Subject: RE: using match to obtain non-sorted index values from
> non-sorted vector
>
> There may be a faster way, but
>
>> sapply(Tset, function(x) which(pop.df$pop==x))
> [1] 5 4 2
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Folkes, Michael
> Sent: Wednesday, July 9, 2014 2:58 PM
> To: r-help at r-project.org
> Subject: [R] using match to obtain non-sorted index values from
> non-sorted vector
>
> Hello all,
>
> I've been struggling with the best way to find index values from a large
> vector with elements that will match elements of a subset vector [the
> table argument in match()].
>
> BUT the index values can't come out sorted (as we'd get in which(X %in%
> Y) ).
>
> My 'population' vector can't be sorted.
>
> pop.df <- data.frame(pop=c(1,6,4,3,10))
>
> The subset: Tset = c(10,3,6)
>
>
>
> So I'd like to get these index values (from pop.df) , in this order:
> 5,4,2
>
>
>
> If it could be sorted I could use:
>
> which(sort(pop.df$pop) %in% sort(Tset))
>
>
>
> But sorting will cause more grief later, so best not mess with it.
>
> Here is my hopefully adequate MWE of a solution. I'm keen to see if
> anybody has a better suggestion.
>
> Thanks!
>
> _____________________
>
> ###BEGIN R
>
> #pop is the full set of values, it has no info on their ranking
>
> # I don't want to sort these data. They need to remain in this order.
>
> pop.df <- data.frame(pop=c(1,6,4,3,10))
>
>
>
> #rank.df is my dataframe that tells me the top three rankings (derived
> # elsewhere)
>
> rank.df <- data.frame(rank=1:3, Tset = c(10,3,6)) # Target set
>
>
>
> #match.df will be my source of row index based on rank
>
> match.df <- data.frame(match.vec= match(pop.df$pop, table=rank.df$Tset),
> index.vec=1:nrow(pop.df))
>
>
> #rank.df will now include the index location in the pop.df where I can
> # find the top three ranks.
>
> rank.df <- merge(rank.df, match.df, by.x='rank', by.y='match.vec')
>
> rank.df
>
>
> ####END
>
>
>
> _______________________________________________________
>
> Michael Folkes
>
> Salmon Stock Assessment
>
David Winsemius
Alameda, CA, USA