[R] data frame manipulation with condition

William Dunlap wdunlap at tibco.com
Fri Feb 24 18:58:13 CET 2012


When a factor is used as a subscript, it is treated as
its integer codes, so an explicit conversion to character
is needed if you want to subscript by names:
  > f <- factor(c("One","Three","Two"), levels=c("One","Two","Three"))
  > x <- c(Two=2, One=1, Three=3)
  > x[f]
    Two Three   One 
      2     3     1 
  > x[as.character(f)]
    One Three   Two 
      1     3     2
For most other functions (e.g., %in%, paste, sprintf("%s"))
you do not need an explicit conversion to character, but '['
requires you to choose.
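
To illustrate the contrast, here is a minimal sketch reusing f and x from
above and the df/mult example from earlier in the thread. The variable names
(codes, same_as_codes, in_result, paste_result, fmt_result, by_code, by_name)
are made up for this illustration. stringsAsFactors = TRUE is spelled out on
the assumption that you may be running R 4.0.0 or later, where data.frame()
no longer converts strings to factors by default.

```r
# Factor whose level order differs from the order of the names in x.
f <- factor(c("One", "Three", "Two"), levels = c("One", "Two", "Three"))
x <- c(Two = 2, One = 1, Three = 3)

# '[' sees the integer codes, so x[f] selects the same elements as x[codes].
codes <- as.integer(f)
same_as_codes <- identical(x[f], x[codes])

# These functions work on the labels without an explicit conversion.
in_result    <- f %in% c("One", "Two")
paste_result <- paste(f, "item")
fmt_result   <- sprintf("%s", f)

# The thread's data, with the factor column requested explicitly.
df <- data.frame(x = c("AA", "BB", "CC", "AA", "DD", "DD"), y = 1:6,
                 stringsAsFactors = TRUE)
mult <- c(AA = 10, BB = 25, DD = 15)

# Subscripting by factor uses positions 1,2,3,1,4,4 in mult ...
by_code <- unname(mult[df$x])
# ... while subscripting by character looks up the names AA, BB, CC, ...
by_name <- unname(mult[as.character(df$x)])
```

by_code puts DD's multiplier in the CC row and NA in the DD rows, while
by_name has NA only for CC, which has no entry in mult.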

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> -----Original Message-----
> From: Sarah Goslee [mailto:sarah.goslee at gmail.com]
> Sent: Friday, February 24, 2012 9:39 AM
> To: William Dunlap
> Cc: Arnaud Gaboury; r-help at r-project.org
> Subject: Re: [R] data frame manipulation with condition
> 
> On Fri, Feb 24, 2012 at 12:23 PM, William Dunlap <wdunlap at tibco.com> wrote:
> > Use mult[as.character(df$x)] instead of mult[df$x].
> > They are different when df$x is a factor and the
> > character version is what you want.
> 
> R will coerce a factor to character to perform the comparison; explicitly
> calling as.character() is not necessary:
> 
> > df$x
> [1] AA BB CC AA DD DD
> > df$x == "AA"
> [1]  TRUE FALSE FALSE  TRUE FALSE FALSE
> 
> See ?factor for details.
> 
> Sarah
> 
> >  > df<- data.frame(x = c("AA","BB","CC","AA","DD","DD"), y = 1:6)
> >  > mult <- c(AA = 10, BB = 25,DD=15)
> >  > df$y <- df$y * mult[as.character(df$x)]
> >  > df
> >     x  y
> >  1 AA 10
> >  2 BB 50
> >  3 CC NA
> >  4 AA 40
> >  5 DD 75
> >  6 DD 90
> >
> > This gets the order right.  The NA for "CC" is because
> > your vector of multipliers didn't include an entry for
> > CC.  You can either add CC=1 to mult or work only on the
> > subset of the data which has entries in the mult vector.
> >
> >  > df<- data.frame(x = c("AA","BB","CC","AA","DD","DD"), y = 1:6)
> >  > mult <- c(AA = 10, BB = 25,DD=15)
> >  > i <- as.character(df$x) %in% names(mult)
> >  > df$y[i] <- df$y[i] * mult[as.character(df$x[i])]
> >  > df
> >     x  y
> >  1 AA 10
> >  2 BB 50
> >  3 CC  3
> >  4 AA 40
> >  5 DD 75
> >  6 DD 90
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Arnaud Gaboury
> >> Sent: Friday, February 24, 2012 8:37 AM
> >> To: Uwe Ligges
> >> Cc: r-help at r-project.org
> >> Subject: Re: [R] data frame manipulation with condition
> >>
> >> > df<- data.frame(x = c("AA","BB","CC","AA","DD","DD"), y = 1:6)
> >> > mult <- c(AA = 10, BB = 25,DD=15)
> >> > df$y <- df$y * mult[df$x]
> >> > df
> >>    x  y
> >> 1 AA 10
> >> 2 BB 50
> >> 3 CC 45
> >> 4 AA 40
> >> 5 DD NA
> >> 6 DD NA
> >>
> >> My df is in fact much longer than the example shown here. It seems your tip didn't do the job.
> >> I am expecting this result:
> >>
> >> > df
> >>    x  y
> >> 1 AA 10  ----> if df$x==AA, df$y<-1*10
> >> 2 BB 50   ----> if df$x==BB, df$y<-2*25
> >> 3 CC 3         NOTHING
> >> 4 AA 40    ----> if df$x==AA, df$y<-4*10
> >> 5 DD 75   ----> if df$x==DD, df$y<-5*15
> >> 6 DD 90   ----> if df$x==DD, df$y<-6*15
> >>
> >> Arnaud Gaboury
> >>
> >> A2CT2 Ltd.
> >>
> >> -----Original Message-----
> >> From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de]
> >> Sent: Friday, 24 February 2012 17:07
> >> To: Arnaud Gaboury
> >> Cc: r-help at r-project.org
> >> Subject: Re: [R] data frame manipulation with condition
> >>
> >>
> >>
> >> On 24.02.2012 16:59, Arnaud Gaboury wrote:
> >> > TY Uwe,
> >> >
> >> > So I will have to write a line for each condition? Right?
> >> >
> >> > In fact I was trying to do something with apply in one line, but couldn't achieve any result. All my transformations will multiply one object by a specific number according to the value of df$x.
> >>
> >> In that case:
> >>
> >> mult <- c(AA = 10, BB = 25)
> >>
> >> Then:
> >>
> >>
> >> df$y <- df$y * mult[df$x]
> >>
> >>
> >> Uwe Ligges
> >>
> >>
> >> >
> >> > Arnaud Gaboury
> >> >
> >> > A2CT2 Ltd.
> >> >
> >> >
> >> > -----Original Message-----
> >> > From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de]
> >> > Sent: Friday, 24 February 2012 16:33
> >> > To: Arnaud Gaboury
> >> > Cc: r-help at r-project.org
> >> > Subject: Re: [R] data frame manipulation with condition
> >> >
> >> >
> >> >
> >> > On 24.02.2012 16:25, Arnaud Gaboury wrote:
> >> >> Dear list,
> >> >>
> >> >> n00b question, but still can't find any easy answer.
> >> >>
> >> >> Here is a df:
> >> >
> >> >
> >> > Change
> >> >
> >> >>> df<-data.frame(cbind(x=c("AA","BB","CC","AA"),y=1:4))
> >> >
> >> > to
> >> >
> >> >    df<- data.frame(x = c("AA","BB","CC","AA"), y = 1:4)
> >> >
> >> > to make your object a sensible data.frame.
> >> >
> >> >
> >> >
> >> >>> df
> >> >>      x y
> >> >> 1 AA 1
> >> >> 2 BB 2
> >> >> 3 CC 3
> >> >> 4 AA 4
> >> >>
> >> >>
> >> >> I want to modify this df this way :
> >> >>    if df$x=="AA" then df$y=df$y*10
> >> >
> >> > df$y[df$x=="AA"]<- df$y[df$x=="AA"] * 10
> >> >
> >> > ...
> >> >
> >> >
> >> > Uwe Ligges
> >> >
> >> >
> >> >>    if df$x=="BB" then df$y=df$y*25
> >> >
> >> >
> >> >
> >> >
> >> >> and so on with other conditions.
> >> >>
> >> >> TY for any help.
> >> >>
> >> >> Trading
> >> >>
> >> >> A2CT2 Ltd.
> >> >>


