[R] data frame manipulation with condition
Arnaud Gaboury
arnaud.gaboury at a2ct2.com
Fri Feb 24 18:52:55 CET 2012
In fact I need to use William's tip: use mult[as.character(df$x)] instead of mult[df$x].
Let's try again with a shorter df as example:
The rule: if AA, then multiply y by 2, if BB multiply y by 5, if CC do nothing, if DD multiply by 2.
Let's say day 1 I have df1:
df1 <-
structure(list(x = structure(c(1L, 2L, 2L, 3L), .Label = c("AA",
"BB", "CC"), class = "factor"), y = 1:4), .Names = c("x", "y"
), row.names = c(NA, -4L), class = "data.frame")
> df1
x y
1 AA 1
2 BB 2
3 BB 3
4 CC 4
> mult <- c("AA"=2,"BB"=5,"CC"=1,"DD"=2)
> df1$y <- df1$y * mult[as.character(df1$x)]
> df1
x y
1 AA 2
2 BB 10
3 BB 15
4 CC 4
WORKING
Now day 2 with df2:
> df2 <- data.frame(x = c("AA","AA","BB","BB","BB","CC","DD","DD"), y = 1:8)
> df2$y <- df2$y * mult[as.character(df2$x)]
> df2
x y
1 AA 2
2 AA 4
3 BB 15
4 BB 20
5 BB 25
6 CC 6
7 DD 14
8 DD 16
WORKING
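
For readers wondering why as.character() matters: a minimal sketch, assuming the lookup column is a factor (the default for strings in data.frame() at the time of this thread; since R 4.0.0 strings stay character unless stringsAsFactors = TRUE). The multiplier values and their ordering below are illustrative only, chosen so that the two indexing styles disagree.

```r
# Multipliers deliberately listed out of alphabetical order (illustrative values).
mult <- c(DD = 2, CC = 1, BB = 5, AA = 2)
x <- factor(c("AA", "BB", "CC"))

# Indexing with a factor uses its integer codes (1, 2, 3),
# so elements are picked by position in mult, not by name.
by_position <- mult[x]

# Indexing with a character vector looks elements up by name.
by_name <- mult[as.character(x)]

names(by_position)  # "DD" "CC" "BB"
names(by_name)      # "AA" "BB" "CC"
```

When the names of mult happen to be in the same order as the factor levels, both forms give the same answer, which is why the problem can go unnoticed on small examples.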
Thank you both, and have a good weekend.
Arnaud Gaboury
A2CT2 Ltd.
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Arnaud Gaboury
Sent: Friday, 24 February 2012 18:17
To: Sarah Goslee
Cc: r-help at r-project.org
Subject: Re: [R] data frame manipulation with condition
Thank you very much, Sarah: your tip does the job:
reported <-
structure(list(Product = structure(c(1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 5L, 5L, 5L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 13L, 14L, 14L), .Label = c("CL", "Cocoa", "Coffee C", "GC", "HG", "HO", "NG", "PL", "RB", "SI", "Sugar No 11", "ZC", "ZL", "ZW"), class = "factor"), reported.Price = c(105.35, 2380, 2407, 2408, 202.35, 202.8, 202.95, 205.85, 206.05, 206.1, 206.2, 1748, 378.8, 379.25, 379.5, 320.61, 2.538, 2.543, 1669, 1678.5, 304.49, 321.39, 321.6, 321.65, 322.5, 322.55, 322.8, 323.04, 3390, 3397.5, 24.16, 24.2, 24.22, 24.23, 24.54, 25.5, 25.55, 631.75, 638, 53.77, 630.75, 633), reported.Nbr.Lots = c(6L, 3L, -1L, -2L, -40L, -1L, -1L, 10L, 5L, 6L, 19L, 17L, 23L, 12L, 35L, 11L, -54L, -52L, 26L, 26L, 10L, -10L, 1L, 4L, 4L, 1L, 5L, 5L, 17L, 17L, 114L, 71L, 16L, 27L, -3L, 3L, -3L, -89L, -1L, -1L, -51L, -51L)), .Names = c("Product", "reported.Price", "reported.Nbr.Lots"
), row.names = c(7L, 4L, 5L, 6L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 8L, 9L, 10L, 11L, 12L, 20L, 21L, 22L, 23L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 31L, 32L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 2L, 3L, 1L, 33L, 34L), class = "data.frame")
> mult<-c(CL=100,GC=10,HG=10,NG=1000,PL=10,RB=100,SI=10,ZL=100,HO=100,KC=1,CC=1,SB=1,ZC=1,ZW=1)
> reported$reported.Price <- reported$reported.Price * mult[reported$Product]
reported <-
structure(list(Product = structure(c(1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 5L, 5L, 5L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 13L, 14L, 14L), .Label = c("CL", "Cocoa", "Coffee C", "GC", "HG", "HO", "NG", "PL", "RB", "SI", "Sugar No 11", "ZC", "ZL", "ZW"), class = "factor"), reported.Price = c(10535, 23800, 24070, 24080, 2023.5, 2028, 2029.5, 2058.5, 2060.5, 2061, 2062, 1748000, 3788, 3792.5, 3795, 32061, 25.38, 25.43, 166900, 167850, 30449, 32139, 32160, 32165, 32250, 32255, 32280, 32304, 3390, 3397.5, 24.16, 24.2, 24.22, 24.23, 24.54, 25.5, 25.55, 631.75, 638, 53.77, 630.75, 633), reported.Nbr.Lots = c(6L, 3L, -1L, -2L, -40L, -1L, -1L, 10L, 5L, 6L, 19L, 17L, 23L, 12L, 35L, 11L, -54L, -52L, 26L, 26L, 10L, -10L, 1L, 4L, 4L, 1L, 5L, 5L, 17L, 17L, 114L, 71L, 16L, 27L, -3L, 3L, -3L, -89L, -1L, -1L, -51L, -51L)), .Names = c("Product", "reported.Price", "reported.Nbr.Lots"
), row.names = c(7L, 4L, 5L, 6L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 8L, 9L, 10L, 11L, 12L, 20L, 21L, 22L, 23L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 31L, 32L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 2L, 3L, 1L, 33L, 34L), class = "data.frame")
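
A check that may be worth running before applying a lookup like this one: a minimal sketch, assuming the multiplier names are meant to match the factor labels exactly. The inputs below are shortened, hypothetical stand-ins for the real Product column and multiplier vector.

```r
# Hypothetical, shortened inputs for illustration only.
product <- factor(c("CL", "Cocoa", "Coffee C", "GC"))
mult <- c(CL = 100, GC = 10, CC = 1, KC = 1)

# Levels that have no multiplier of the same name.
missing_levels <- setdiff(levels(product), names(mult))
missing_levels

# A name-based lookup returns NA for those rows,
# which makes the mismatch visible instead of silent.
m <- mult[as.character(product)]
is.na(m)
```

Here "Cocoa" and "Coffee C" have no entry named after them, so either the labels or the names of mult would need to be aligned first.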
Have a good weekend.
Arnaud Gaboury
A2CT2 Ltd.
Trade: +41 22 849 88 63
Fax: +41 22 849 88 66
arnaud.gaboury at a2ct2.com
-----Original Message-----
From: Sarah Goslee [mailto:sarah.goslee at gmail.com]
Sent: Friday, 24 February 2012 17:54
To: Arnaud Gaboury
Cc: r-help at r-project.org
Subject: Re: [R] data frame manipulation with condition
You need, as I already suggested, to use a value of 1 for levels you don't want to change.
> mult <- c(AA = 10, BB = 25, CC=1, DD=15)
> mult[df$x]
AA BB CC AA DD DD
10 25 1 10 15 15
> df$y * mult[df$x]
AA BB CC AA DD DD
10 50 3 40 75 90
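
An alternative to listing a multiplier of 1 for every untouched level is to fill the gaps after the lookup. This is only a sketch under the assumption that any level missing from mult should be left unchanged; it also uses as.character() so the lookup is by name rather than by factor code.

```r
df <- data.frame(x = c("AA", "BB", "CC", "AA", "DD", "DD"), y = 1:6)
mult <- c(AA = 10, BB = 25, DD = 15)   # CC intentionally left out

m <- mult[as.character(df$x)]          # NA where a level has no multiplier
m[is.na(m)] <- 1                       # unlisted levels keep their value
df$y <- df$y * unname(m)
df$y
```

The trade-off is that a misspelled name in mult is then silently treated as "no change", so the explicit value of 1 is the safer choice when the set of levels is small.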
On Fri, Feb 24, 2012 at 11:36 AM, Arnaud Gaboury <arnaud.gaboury at a2ct2.com> wrote:
>> df <- data.frame(x = c("AA","BB","CC","AA","DD","DD"), y = 1:6)
>> mult <- c(AA = 10, BB = 25,DD=15)
>> df$y <- df$y * mult[df$x]
>> df
> x y
> 1 AA 10
> 2 BB 50
> 3 CC 45
> 4 AA 40
> 5 DD NA
> 6 DD NA
>
> My df is in fact much longer than the example shown here. It seems your tip didn't do the job.
> I am expecting this as result :
>
>> df
> x y
> 1 AA 10 ----> if df$x==AA, df$y<-1*10
> 2 BB 50 ----> if df$x==BB, df$y<-2*25
> 3 CC 3 NOTHING
> 4 AA 40 ----> if df$x==AA, df$y<-4*10
> 5 DD 75 ----> if df$x==DD, df$y<-5*15
> 6 DD 90 ----> if df$x==DD, df$y<-6*15
>
> Arnaud Gaboury
>
> A2CT2 Ltd.
>
> -----Original Message-----
> From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de]
> Sent: Friday, 24 February 2012 17:07
> To: Arnaud Gaboury
> Cc: r-help at r-project.org
> Subject: Re: [R] data frame manipulation with condition
>
>
>
> On 24.02.2012 16:59, Arnaud Gaboury wrote:
>> TY Uwe,
>>
>> So I will have to write a line for each condition? Right?
>>
>> In fact I was trying to do something with apply in one line, but couldn't get any result. All my transformations will multiply one object by a specific number according to the value of df$x.
>
> In that case:
>
> mult <- c(AA = 10, BB = 25)
>
> Then:
>
>
> df$y <- df$y * mult[df$x]
>
>
> Uwe Ligges
>
>
>>
>> Arnaud Gaboury
>>
>> A2CT2 Ltd.
>>
>>
>> -----Original Message-----
>> From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de]
>> Sent: Friday, 24 February 2012 16:33
>> To: Arnaud Gaboury
>> Cc: r-help at r-project.org
>> Subject: Re: [R] data frame manipulation with condition
>>
>>
>>
>> On 24.02.2012 16:25, Arnaud Gaboury wrote:
>>> Dear list,
>>>
>>> n00b question, but still can't find any easy answer.
>>>
>>> Here is a df:
>>
>>
>> Change
>>
>>>> df<-data.frame(cbind(x=c("AA","BB","CC","AA"),y=1:4))
>>
>> to
>>
>> df<- data.frame(x = c("AA","BB","CC","AA"), y = 1:4)
>>
>> to make your object a sensible data.frame.
>>
>>
>>
>>>> df
>>> x y
>>> 1 AA 1
>>> 2 BB 2
>>> 3 CC 3
>>> 4 AA 4
>>>
>>>
>>> I want to modify this df this way :
>>> if df$x=="AA" then df$y=df$y*10
>>
>> df$y[df$x=="AA"]<- df$y[df$x=="AA"] * 10
>>
>> ...
>>
>>
>> Uwe Ligges
>>
>>
>>> if df$x=="BB" then df$y=df$y*25
>>
>>
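
The cbind() remark quoted above can be illustrated with a short sketch (the names bad and good are only for this example): cbind() on a character vector and an integer vector produces a character matrix, so y is no longer numeric by the time data.frame() sees it.

```r
# cbind() coerces both columns to character before the data frame is built.
bad  <- data.frame(cbind(x = c("AA", "BB", "CC", "AA"), y = 1:4))

# Passing the columns directly keeps each column's own type.
good <- data.frame(x = c("AA", "BB", "CC", "AA"), y = 1:4)

class(bad$y)       # "character" in R >= 4.0.0, "factor" in older versions
class(good$y)      # "integer"
is.numeric(bad$y)  # FALSE, so bad$y * 10 would not work as intended
```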
--
Sarah Goslee
http://www.functionaldiversity.org
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.