[R] Replace rows in dataframe based on values in other columns
arun
smartpink111 at yahoo.com
Thu May 9 13:09:37 CEST 2013
Hi,
dat1<- read.table(text="
Restaurant owner purchase_date
23 Chuck 3/4/2011
23 Chuck 3/4/2011
23 Chuck 3/4/2011
23 Chuck 3/4/2011
23 Bob 1/1/2013
23 Bob 1/1/2013
23 Bob 1/1/2013
15 Hazel 4/11/2010
15 Hazel 4/11/2010
15 Hazel 4/11/2010
15 Hazel 4/11/2010
17 Pete 9/2/2012
17 Pete 9/2/2012
17 Pete 9/2/2012
17 Pete 9/2/2012
",sep="",header=TRUE,stringsAsFactors=FALSE)
#if the dates are in order for the Restaurant group,
dat1$Current_Owner<-with(dat1,ave(owner,Restaurant,FUN=function(x) tail(x,1)))
dat1
# Restaurant owner purchase_date Current_Owner
#1 23 Chuck 3/4/2011 Bob
#2 23 Chuck 3/4/2011 Bob
#3 23 Chuck 3/4/2011 Bob
#4 23 Chuck 3/4/2011 Bob
#5 23 Bob 1/1/2013 Bob
#6 23 Bob 1/1/2013 Bob
#7 23 Bob 1/1/2013 Bob
#8 15 Hazel 4/11/2010 Hazel
#9 15 Hazel 4/11/2010 Hazel
#10 15 Hazel 4/11/2010 Hazel
#11 15 Hazel 4/11/2010 Hazel
#12 17 Pete 9/2/2012 Pete
#13 17 Pete 9/2/2012 Pete
#14 17 Pete 9/2/2012 Pete
#15 17 Pete 9/2/2012 Pete
If the order is different:
dat2<- read.table(text="
Restaurant owner purchase_date
23 Bob 1/1/2013
23 Bob 1/1/2013
23 Bob 1/1/2013
23 Chuck 3/4/2011
23 Chuck 3/4/2011
23 Chuck 3/4/2011
23 Chuck 3/4/2011
15 Hazel 4/11/2010
15 Hazel 4/11/2010
15 Hazel 4/11/2010
15 Hazel 4/11/2010
17 Pete 9/2/2012
17 Pete 9/2/2012
17 Pete 9/2/2012
17 Pete 9/2/2012
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2New<-unsplit(lapply(split(dat2,dat2$Restaurant), FUN= function(x) {x1<-x[order(as.Date(x$purchase_date,format="%m/%d/%Y")),]; x1$Current_Owner<- tail(x1$owner,1); x1}),dat2$Restaurant)
rownames(dat2New)<- 1:nrow(dat2New)
dat2New
# Restaurant owner purchase_date Current_Owner
#1 23 Chuck 3/4/2011 Bob
#2 23 Chuck 3/4/2011 Bob
#3 23 Chuck 3/4/2011 Bob
#4 23 Chuck 3/4/2011 Bob
#5 23 Bob 1/1/2013 Bob
#6 23 Bob 1/1/2013 Bob
#7 23 Bob 1/1/2013 Bob
#8 15 Hazel 4/11/2010 Hazel
#9 15 Hazel 4/11/2010 Hazel
#10 15 Hazel 4/11/2010 Hazel
#11 15 Hazel 4/11/2010 Hazel
#12 17 Pete 9/2/2012 Pete
#13 17 Pete 9/2/2012 Pete
#14 17 Pete 9/2/2012 Pete
#15 17 Pete 9/2/2012 Pete
A.K.
>I have the following data frame including restaurants (id#) ,owner and
purchase date. Sometimes the restaurant changes ownership, but for
>analysis purposes I want to create another column which only keeps the
name of the latest owner, this is determined by the "purchase date". How
>can I create this new column? In this example restaurant 23 changes
hand on 1/1/2013, so I want to include only the new owner's name (Bob)
as >the current owner for all rows for this restaurant as shown below. If
the restaurant does not change hands, keep the same name for the
"current >owner" column as the "owner" column.
>
>Thanks,
>Johnny
>
>
>
>Restaurant owner purchase date
> 23 Chuck 3/4/2011
> 23 Chuck 3/4/2011
> 23 Chuck 3/4/2011
> 23 Chuck 3/4/2011
> 23 Bob 1/1/2013
> 23 Bob 1/1/2013
> 23 Bob 1/1/2013
> 15 Hazel 4/11/2010
> 15 Hazel 4/11/2010
> 15 Hazel 4/11/2010
> 15 Hazel 4/11/2010
> 17 Pete 9/2/2012
> 17 Pete 9/2/2012
> 17 Pete 9/2/2012
> 17 Pete 9/2/2012
>
>Restaurant owner purchase date Current owner
> 23 Chuck 3/4/2011 Bob
> 23 Chuck 3/4/2011 Bob
> 23 Chuck 3/4/2011 Bob
> 23 Chuck 3/4/2011 Bob
> 23 Bob 1/1/2013 Bob
> 23 Bob 1/1/2013 Bob
> 23 Bob 1/1/2013 Bob
> 15 Hazel 4/11/2010 Hazel
> 15 Hazel 4/11/2010 Hazel
> 15 Hazel 4/11/2010 Hazel
> 15 Hazel 4/11/2010 Hazel
> 17 Pete 9/2/2012 Pete
> 17 Pete 9/2/2012 Pete
> 17 Pete 9/2/2012 Pete
> 17 Pete 9/2/2012 Pete
More information about the R-help
mailing list