[R] what is the effective method to apply the below logic for ~1.2 million records in R
Ravi Teja
raviteja2504 at gmail.com
Sat Sep 19 23:09:45 CEST 2015
Hi,
I am trying to apply the below logic to generate flag_1 column on a data
set consisting of ~1.2 million records in R.
Code:
for (i in seq_len(nrow(A)))
{
  # The last row has no successor, so guard the i+1 comparison.
  if (i < nrow(A) && A$customer[i] == A$customer[i+1])
  {
    if (is.na(A$Time_Diff[i]))
      A$flag_1[i] <- 1
    else if (A$Time_Diff[i] > 12)
      A$flag_1[i] <- 1
    else
      A$flag_1[i] <- A$flag_1[i-1] + 1
  }
  else
  {
    if (is.na(A$Time_Diff[i]))
      A$flag_1[i] <- 1
    else if (A$Time_Diff[i] > 12)
      A$flag_1[i] <- 1
    else
      A$flag_1[i] <- A$flag_1[i-1] + 1
  }
}
Resultant dataset should look like:
customer  Time_Diff  flag_1
1         NA         1
1         10         2
1         8          3
1         15         1
1         9          2
1         10         3
2         NA         1
2         2          2
2         5          3
The above logic will take approximately 60 hours to generate the flag_1
column on a dataset consisting of ~1.2 million records. Is there a more
efficient way to implement this logic in R?
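For reference, the loop can probably be expressed without iterating row by row. Both branches of the customer comparison assign the same values, so flag_1 depends only on Time_Diff: it restarts at 1 whenever Time_Diff is NA or greater than 12, and otherwise counts up from the previous row. The following is a minimal sketch under that reading, not a solution tested on the full data. The names starts_run, idx and last_start are made up for illustration, the sample rows are copied from the table above, and it assumes the first row of each customer has Time_Diff equal to NA, as in that table.

```r
# Sample data copied from the expected-output table in this post.
A <- data.frame(
  customer  = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  Time_Diff = c(NA, 10, 8, 15, 9, 10, NA, 2, 5)
)

# A row restarts the count when Time_Diff is NA or greater than 12.
# Assumption: the very first row always restarts the count.
starts_run <- is.na(A$Time_Diff) | A$Time_Diff > 12
starts_run[1] <- TRUE

# Row index of the most recent restart at or before each row.
idx <- seq_along(starts_run)
last_start <- cummax(idx * starts_run)

# Position within the current run: 1 at a restart, then 2, 3, ...
A$flag_1 <- idx - last_start + 1L

print(A)
```

Every step here is a single vectorized call over the whole column, so the cost should grow linearly with the number of rows instead of copying the column on each loop iteration.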
Appreciate your help.
Thanks,
Ravi
More information about the R-help mailing list