[R] matching a sequence in a vector?
Petr Savicky
savicky at cs.cas.cz
Wed Feb 15 11:21:45 CET 2012
On Wed, Feb 15, 2012 at 10:26:44AM +0100, Berend Hasselman wrote:
>
> On 15-02-2012, at 05:17, Redding, Matthew wrote:
>
> > Hi All,
> >
> >
> > I've been trawling through the documentation and listserv archives on this topic -- but
> > as yet have not found a solution. I'm sure this is pretty simple with R, but I cannot work out how without
> > resorting to ugly nested loops.
> >
> > As far as I can tell, grep, match, and %in% are not the correct tools.
> >
> > Question:
> > given these vectors --
> > patrn <- c(1,2,3,4)
> > exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
> >
> > how do I get the desired answer by finding the occurrences of the pattern and returning the starting indices:
> > 6, 13, 23
> >
>
> patrn.rev <- rev(patrn)
> w <- embed(exmpl,length(patrn))
> w.pos <- apply(w,1,function(r) all(r == patrn.rev))
> which(w.pos)

Hi.

If speed is an issue and exmpl is long, the
following modification may be faster.

  patrn.rev <- rev(patrn)
  w <- embed(exmpl,length(patrn))
  which(rowSums(w == rep(patrn.rev, each=nrow(w))) == ncol(w))

  [1]  6 13 23

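To make it clearer why the pattern is reversed and why rep(..., each=nrow(w)) lines up with the matrix, here is a small sketch of the same idea wrapped in a function. The name find_pattern and the length guard are my own additions for illustration, not part of the suggestion above.

```r
# Hypothetical helper (name chosen for illustration only).
find_pattern <- function(x, pat) {
  k <- length(pat)
  # embed() stops with an error if the window is longer than x, so guard it.
  if (k == 0 || k > length(x)) return(integer(0))
  # Row i of w is x[i+k-1], ..., x[i]: each window appears in reverse order,
  # which is why it is compared against rev(pat).
  w <- embed(x, k)
  # Matrices are stored column by column, so repeating each element of
  # rev(pat) nrow(w) times yields a vector whose j-th block fills column j.
  target <- rep(rev(pat), each = nrow(w))
  # A row matches when all k of its entries agree with the target.
  which(rowSums(w == target) == k)
}

patrn <- c(1,2,3,4)
exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
find_pattern(exmpl, patrn)
```

For the vectors from the original question this returns the starting indices 6, 13 and 23. The comparison w == target is a single vectorized operation on the whole matrix, so no R-level function is called per row.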
For length(patrn) = 11 and length(exmpl) = 10000, I obtained
a speedup by a factor of 10.
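
A rough way to repeat such a comparison is sketched below. The random test data, the seed and the way the pattern is chosen are assumptions made for this sketch. They are not the data used for the figure quoted above, and the timings will differ between machines.

```r
set.seed(1)                                  # assumed seed, only for reproducibility
exmpl <- sample(1:3, 10000, replace = TRUE)  # assumed test vector of length 10000
patrn <- exmpl[101:111]                      # length-11 pattern known to start at index 101

patrn.rev <- rev(patrn)
w <- embed(exmpl, length(patrn))

# Row-by-row version using apply()
t.apply <- system.time(
  r.apply <- which(apply(w, 1, function(r) all(r == patrn.rev)))
)

# Vectorized version using rowSums()
t.rowsums <- system.time(
  r.rowsums <- which(rowSums(w == rep(patrn.rev, each = nrow(w))) == ncol(w))
)

# Both versions should report the same starting indices.
stopifnot(identical(r.apply, r.rowsums))

cat("apply:  ", t.apply[["elapsed"]], "s\n")
cat("rowSums:", t.rowsums[["elapsed"]], "s\n")
```

A single system.time() call on vectors of this size is quite noisy, so repeating each call several times gives a more reliable comparison.
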

Hope this helps.

How large are the vectors "patrn" and "exmpl" in your application?

Petr Savicky.