[R] Executing for loop by grouping variable within dataframe
Dennis Murphy
djmuser at gmail.com
Thu Jul 28 01:44:02 CEST 2011
Hi:
I don't get exactly the same results as you did in the second group
(how does temp.t[1] = -2.0 instead of -2.2?) but try this:
locality=c("USC00020958", "USC00020958", "USC00020958", "USC00020958",
"USC00020958", "USC00021001","USC00021001", "USC00021001", "USC00021001",
"USC00021001", "USC00021001")
temp.a=c(-1.2, -1.2, -1.2, -1.2, -1.1, -2.2, -2.4, -2.6,-2.7, -2.8, -3.0)
month= c(12, 12, 12, 12, 12, 11, 11, 11, 11, 11, 11)
day= c(27, 28, 29, 30, 31, 1, 2, 3, 4, 5, 6)
df=data.frame(locality, temp.a, month, day)
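f <- function(d) {
    k <- 0.8    # smoothing weight on the current temp.a value
    # A one-row group has nothing to smooth: temp.t equals that row's temp.a
    if(nrow(d) == 1L) return(data.frame(d, temp.t = d$temp.a))
    tmp <- rep(NA_real_, nrow(d))
    tmp[1] <- d$temp.a[1]    # seed with the group's first observation
    for(j in 2:length(tmp))
        tmp[j] <- tmp[j - 1] + k * (d$temp.a[j] - tmp[j - 1])
    data.frame(d, temp.t = tmp)
}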
library(plyr)    # stops with an error if plyr is not installed
ddply(df, 'locality', f)
      locality temp.a month day    temp.t
1  USC00020958   -1.2    12  27 -1.200000
2  USC00020958   -1.2    12  28 -1.200000
3  USC00020958   -1.2    12  29 -1.200000
4  USC00020958   -1.2    12  30 -1.200000
5  USC00020958   -1.1    12  31 -1.120000
6  USC00021001   -2.2    11   1 -2.200000
7  USC00021001   -2.4    11   2 -2.360000
8  USC00021001   -2.6    11   3 -2.552000
9  USC00021001   -2.7    11   4 -2.670400
10 USC00021001   -2.8    11   5 -2.774080
11 USC00021001   -3.0    11   6 -2.954816
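For comparison, here is a minimal base-R sketch of the same grouped recursion using ave() and Reduce() instead of plyr. The helper name smooth is made up for illustration and does not come from the original post. The sketch also runs the recursion once without grouping, which seems to explain the -2.0 in the original post: an ungrouped loop seeds row 6 from row 5 of the previous locality.

```r
# Hypothetical helper: each new value moves a fraction k of the way
# from the previous smoothed value towards the current observation.
# Reduce(accumulate = TRUE) carries the previous result forward.
smooth <- function(x, k = 0.8) {
    Reduce(function(prev, cur) prev + k * (cur - prev), x, accumulate = TRUE)
}

locality <- rep(c("USC00020958", "USC00021001"), times = c(5, 6))
temp.a <- c(-1.2, -1.2, -1.2, -1.2, -1.1, -2.2, -2.4, -2.6, -2.7, -2.8, -3.0)

# Grouped: ave() applies smooth() within each locality,
# so the recursion restarts at the first row of every group.
grouped <- ave(temp.a, locality, FUN = smooth)

# Ungrouped: a single pass over all rows, as in a loop over the whole data frame.
ungrouped <- smooth(temp.a)

round(grouped[6], 1)      # -2.2: restarted from the group's own first value
round(ungrouped[6], 1)    # -2: carried over from row 5 of the first locality
```

Because ave() returns a vector in the original row order, the result can be assigned straight back with df$temp.t <- grouped, and a one-row group needs no special case since Reduce() returns a single value unchanged.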
If you want to round the result, replace the last line of the function with
data.frame(d, temp.t = round(tmp, 1))
Related functions are ceiling() and floor() in case they are of interest.
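A small sketch of how the three functions differ, using three of the temp.t values from the output above. Note that ceiling() and floor() always return whole numbers, so they are not drop-in replacements for round(tmp, 1).

```r
x <- c(-1.12, -2.552, -2.954816)    # three temp.t values from the output above

round(x, 1)    # nearest value with one decimal place: -1.1 -2.6 -3.0
ceiling(x)     # smallest integer not less than x:     -1 -2 -2
floor(x)       # largest integer not greater than x:   -2 -3 -3
```

If you need to round down to one decimal place instead, scaling first should work, e.g. floor(x * 10) / 10.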
HTH,
Dennis
On Wed, Jul 27, 2011 at 10:38 AM, <ssobek at gwdg.de> wrote:
> Dear list,
>
> I have a large dataset which is structured as follows:
>
> locality=c("USC00020958", "USC00020958", "USC00020958", "USC00020958",
> "USC00020958", "USC00021001","USC00021001", "USC00021001", "USC00021001",
> "USC00021001", "USC00021001")
>
> temp.a=c(-1.2, -1.2, -1.2, -1.2, -1.1, -2.2, -2.4, -2.6,-2.7, -2.8, -3.0)
>
> month= c(12, 12, 12, 12, 12, 11, 11, 11, 11, 11, 11)
>
> day= c(27, 28, 29, 30, 31, 1, 2, 3, 4, 5, 6)
>
> df=data.frame(locality,temp.a,month,day)
>
>> locality temp.a month day
>>1 USC00020958 -1.2 12 27
>>2 USC00020958 -1.2 12 28
>>3 USC00020958 -1.2 12 29
>>4 USC00020958 -1.2 12 30
>>5 USC00020958 -1.1 12 31
>>6 USC00021001 -2.2 11 1
>>7 USC00021001 -2.4 11 2
>>8 USC00021001 -2.6 11 3
>>9 USC00021001 -2.7 11 4
>>10 USC00021001 -2.8 11 5
>>11 USC00021001 -3.0 11 6
>
> I would like to calculate a 5th variable, temp.t, based on temp.a, and
> temp.t for the preceding time step. I successfully created a for loop as
> follows:
>
> temp.t=list()
>
> for(i in 2:nrow(df)){
> k=0.8
> temp.t[1]=df$temp.a[1]
> temp.t[i]=(as.numeric(temp.t[i-1]))+k*(as.numeric(df$temp.a[i])-(as.numeric(temp.t[i-1])))
> }
>
> temp.t <- unlist(temp.t)
>
>
> df["temp.t"] <- round(temp.t,1)
>
> df
>
>> locality temp.a month day temp.t
>>1 USC00020958 -1.2 12 27 -1.2
>>2 USC00020958 -1.2 12 28 -1.2
>>3 USC00020958 -1.2 12 29 -1.2
>>4 USC00020958 -1.2 12 30 -1.2
>>5 USC00020958 -1.1 12 31 -1.1
>>6 USC00021001 -2.2 11 1 -2.0
>>7 USC00021001 -2.4 11 2 -2.3
>>8 USC00021001 -2.6 11 3 -2.5
>>9 USC00021001 -2.7 11 4 -2.7
>>10 USC00021001 -2.8 11 5 -2.8
>>11 USC00021001 -3.0 11 6 -3.0
>
> This worked fine as long as I was dealing with datasets that only
> contained one locality. However, as you can see above, my current dataset
> contains more than one locality, and I need to execute my loop for each
> locality separately. What is the best approach to do this?
>
> I have tried repeatedly to put the loop into a command using either ave,
> by or tapply and to specify locality as the grouping variable, but no
> matter what I try, nothing works, because I am unable to specify my loop
> as a function within ave, by, or tapply.
>
> I don't know if I am just doing it wrong (likely!) since I have no
> experience working with loops/functions, or if this is simply not the
> right approach to solve my problem. I was also considering using a nested
> for loop, but failed at setting it up. I would greatly appreciate if
> someone could point me in the right direction.
>
> Thanks a lot,
>
> Stephanie
>