[R] Identifying common prefixes from a vector of words, and delete those prefixes

Christos Hatzis christos.hatzis at nuverabio.com
Thu Jul 31 19:16:49 CEST 2008


A more general solution:

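strip.fun <- function(x, split=".") {
	# split each string into tokens on the literal separator
	xx <- strsplit(x, split, fixed=TRUE)
	# count how often each token occurs across all strings
	txx <- table(unlist(xx))
	# tokens that occur more than once are treated as common
	nxx <- names(txx)[txx > 1]
	# keep only the tokens that occur once
	setdiff(unlist(xx), nxx)
}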

> x <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
> strip.fun(x)
[1] "dog" "cat" "rat"

> y <- c("my_cat_pet", "my_dog_pet", "my_rat_pet")
> strip.fun(y, "_") 
[1] "cat" "dog" "rat"
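
One caveat: strip.fun() pools all tokens with unlist() and setdiff(), so the result is a single vector of unique tokens and is not guaranteed to have one entry per input string. Below is a minimal sketch of a variant that keeps the output aligned with the input. It assumes "common" means a token present in every element, and the name strip.common is made up for illustration:

```r
# Hypothetical variant of strip.fun (the name strip.common is illustrative).
# Drops the tokens shared by every element and returns one string per input.
strip.common <- function(x, split = ".") {
    xx <- strsplit(x, split, fixed = TRUE)
    # tokens that are present in every element
    common <- Reduce(intersect, xx)
    # remove them element by element and re-join whatever is left
    vapply(xx, function(tok) paste(tok[!tok %in% common], collapse = split),
           character(1))
}

z <- c("big.dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
strip.common(z)
# [1] "big.dog" "cat"     "rat"
```

For the same z, strip.fun(z) returns four separate tokens ("big" "dog" "cat" "rat"), so which version is preferable depends on whether the result has to line up with the input.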

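If the goal is to find the shared trailing text itself, without knowing the separator, a character-by-character comparison may be enough. This is only a sketch under the assumption that the common part sits at the end of every string. The helper name strip.suffix is hypothetical and not from this thread:

```r
# Hypothetical helper (not from the thread): remove the longest trailing
# substring shared by all strings, compared character by character.
strip.suffix <- function(x) {
    # reverse each string so the common suffix becomes a common prefix
    rev.chars <- lapply(strsplit(x, ""), rev)
    n <- min(lengths(rev.chars))
    k <- 0
    # extend the match while all strings agree at position k + 1
    while (k < n &&
           length(unique(vapply(rev.chars, `[`, "", k + 1))) == 1) {
        k <- k + 1
    }
    # drop the last k characters from every string
    substr(x, 1, nchar(x) - k)
}

x <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
strip.suffix(x)
# [1] "dog" "cat" "rat"
```

Note that this removes trailing text only, so for c("my_cat_pet", "my_dog_pet", "my_rat_pet") it strips "_pet" and leaves the leading "my_" in place.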

-Christos

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of John Kane
> Sent: Thursday, July 31, 2008 12:48 PM
> To: r-help at stat.math.ethz.ch; Daren Tan
> Subject: Re: [R] Identifying common prefixes from a vector of words,
> and delete those prefixes
> 
> There MUST be a better way but this will work. 
> 
> x <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
> bb <- strsplit(x, "\\.")
> myfun <- function(m) m[1]
> animals <- unlist(lapply(bb, myfun))
> animals
> 
> 
> 
> 
> --- On Thu, 7/31/08, Daren Tan <daren76 at hotmail.com> wrote:
> 
> > From: Daren Tan <daren76 at hotmail.com>
> > Subject: [R] Identifying common prefixes from a vector of words, and
> > delete those prefixes
> > To: r-help at stat.math.ethz.ch
> > Received: Thursday, July 31, 2008, 7:11 AM
> >
> > For example, c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal").
> > How can I identify that the common prefix is ".is.an.animal" and delete it
> > to give c("dog", "cat", "rat")?
> >  
> > Thanks
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
>


