[R] Executing for loop by grouping variable within dataframe

ssobek at gwdg.de ssobek at gwdg.de
Wed Jul 27 19:38:57 CEST 2011


Dear list,

I have a large dataset which is structured as follows:

locality <- c("USC00020958", "USC00020958", "USC00020958", "USC00020958",
              "USC00020958", "USC00021001", "USC00021001", "USC00021001",
              "USC00021001", "USC00021001", "USC00021001")

temp.a <- c(-1.2, -1.2, -1.2, -1.2, -1.1, -2.2, -2.4, -2.6, -2.7, -2.8, -3.0)

month <- c(12, 12, 12, 12, 12, 11, 11, 11, 11, 11, 11)

day <- c(27, 28, 29, 30, 31, 1, 2, 3, 4, 5, 6)

df <- data.frame(locality, temp.a, month, day)

>      locality temp.a month day
>1  USC00020958   -1.2    12  27
>2  USC00020958   -1.2    12  28
>3  USC00020958   -1.2    12  29
>4  USC00020958   -1.2    12  30
>5  USC00020958   -1.1    12  31
>6  USC00021001   -2.2    11   1
>7  USC00021001   -2.4    11   2
>8  USC00021001   -2.6    11   3
>9  USC00021001   -2.7    11   4
>10 USC00021001   -2.8    11   5
>11 USC00021001   -3.0    11   6

I would like to calculate a fifth variable, temp.t, from temp.a and the
value of temp.t at the preceding time step. I successfully created a for
loop as follows:

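k <- 0.8                      # weight given to the current temp.a value
temp.t <- numeric(nrow(df))   # preallocate the result vector
temp.t[1] <- df$temp.a[1]     # the series starts at the first observation

for (i in 2:nrow(df)) {
  # previous temp.t, moved towards the current temp.a by the fraction k
  temp.t[i] <- temp.t[i - 1] + k * (df$temp.a[i] - temp.t[i - 1])
}

df["temp.t"] <- round(temp.t, 1)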

df

>     locality temp.a month day temp.t
>1  USC00020958   -1.2    12  27   -1.2
>2  USC00020958   -1.2    12  28   -1.2
>3  USC00020958   -1.2    12  29   -1.2
>4  USC00020958   -1.2    12  30   -1.2
>5  USC00020958   -1.1    12  31   -1.1
>6  USC00021001   -2.2    11   1   -2.0
>7  USC00021001   -2.4    11   2   -2.3
>8  USC00021001   -2.6    11   3   -2.5
>9  USC00021001   -2.7    11   4   -2.7
>10 USC00021001   -2.8    11   5   -2.8
>11 USC00021001   -3.0    11   6   -3.0
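
Written as a formula, with T standing for temp.t and A for temp.a (this
notation is only shorthand for this message), the loop computes:

```latex
T_1 = A_1, \qquad
T_i = T_{i-1} + k\,(A_i - T_{i-1}) = (1-k)\,T_{i-1} + k\,A_i,
\qquad k = 0.8,\quad i = 2, \dots, n
```

For example, row 6 comes out as -2.0 rather than -2.2 because it is
computed from row 5: -1.12 + 0.8 * (-2.2 - (-1.12)) = -1.984.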

This worked fine as long as I was dealing with datasets that only
contained one locality. However, as you can see above, my current dataset
contains more than one locality, and I need to execute my loop for each
locality separately. What is the best approach to do this?

I have tried repeatedly to put the loop into a command using either ave,
by or tapply and to specify locality as the grouping variable, but no
matter what I try, nothing works, because I am unable to specify my loop
as a function within ave, by, or tapply.
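
To make clear what I am aiming for, here is a rough sketch of the loop
wrapped in a function and handed to ave. The name smooth_temp is my own
invention, and the sketch assumes the rows are already in time order
within each locality; I am not sure this is the right way to do it.

```r
# Hypothetical helper: the recurrence applied to a single vector of temp.a values.
smooth_temp <- function(temp.a, k = 0.8) {
  temp.t <- numeric(length(temp.a))
  temp.t[1] <- temp.a[1]               # each series starts at its own first value
  for (i in seq_along(temp.a)[-1]) {   # empty when the vector has only one element
    temp.t[i] <- temp.t[i - 1] + k * (temp.a[i] - temp.t[i - 1])
  }
  temp.t
}

# Same example data as above, reduced to the two columns that matter here.
df <- data.frame(
  locality = c(rep("USC00020958", 5), rep("USC00021001", 6)),
  temp.a   = c(-1.2, -1.2, -1.2, -1.2, -1.1, -2.2, -2.4, -2.6, -2.7, -2.8, -3.0)
)

# ave() calls FUN on temp.a separately for each locality and returns the
# results in the original row order. FUN must be passed by name.
df$temp.t <- round(ave(df$temp.a, df$locality, FUN = smooth_temp), 1)
```

With this, the first row of each locality would start from its own temp.a
value instead of carrying over temp.t from the previous locality.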

I don't know if I am just doing it wrong (likely!) since I have no
experience working with loops/functions, or if this is simply not the
right approach to solve my problem. I was also considering using a nested
for loop, but failed at setting it up. I would greatly appreciate it if
someone could point me in the right direction.
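
For reference, this is roughly the nested loop I had in mind: an outer
loop over the localities and an inner loop over the rows of each one. It
is only a sketch; the variable names loc, rows and prev are made up for
illustration, and it assumes the rows are in time order within each
locality.

```r
# Same example data as above, reduced to the two columns that matter here.
df <- data.frame(
  locality = c(rep("USC00020958", 5), rep("USC00021001", 6)),
  temp.a   = c(-1.2, -1.2, -1.2, -1.2, -1.1, -2.2, -2.4, -2.6, -2.7, -2.8, -3.0)
)

k <- 0.8
df$temp.t <- NA_real_

for (loc in unique(df$locality)) {           # outer loop: one pass per locality
  rows <- which(df$locality == loc)          # row numbers belonging to this locality
  df$temp.t[rows[1]] <- df$temp.a[rows[1]]   # restart the series at its first row
  for (j in seq_along(rows)[-1]) {           # inner loop: remaining rows, in order
    prev <- df$temp.t[rows[j - 1]]
    df$temp.t[rows[j]] <- prev + k * (df$temp.a[rows[j]] - prev)
  }
}

df$temp.t <- round(df$temp.t, 1)
```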

Thanks a lot,

Stephanie


