[R] avoiding time-consuming for loop renaming identifiers
toby909 at gmail.com
Sat Jul 21 03:26:07 CEST 2007
Hi All
I was wondering if I can avoid a time-consuming for loop on my 600000 obs dataset.
school_id y
8 9.87
8 8.89
8 7.89
8 8.88
20 6.78
20 9.99
20 8.79
31 10.1
31 11
There are, say, 143 different schools in this 600000 obs dataset.
I need to have sequential identifiers, 1,2,3,4,5,...,143.
I was using an awkward for loop that took 30 minutes to run.
sid = 1
dta$sid[1] = 1
for (i in 2:nrow(dta)) {
  # start a new identifier whenever school_id differs from the previous row
  if (dta$school_id[i] != dta$school_id[i-1]) sid = sid+1
  dta$sid[i] = sid
}
Any hints appreciated.
Thanks Toby
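
One way the loop above might be replaced with vectorized code is sketched here. It assumes the data frame is called dta, as in the post, and rebuilds the nine example rows so it runs on its own. The column name sid_run and the helper vector changed are made up for illustration. Two variants are shown because they agree only when each school's rows are contiguous, as they are in the sample.

```r
# Toy data copied from the post; `dta` stands in for the full data frame.
dta <- data.frame(
  school_id = c(8, 8, 8, 8, 20, 20, 20, 31, 31),
  y = c(9.87, 8.89, 7.89, 8.88, 6.78, 9.99, 8.79, 10.1, 11)
)

# Variant 1: number schools by order of first appearance.
# match() gives, for each row, the position of its school_id in unique().
dta$sid <- match(dta$school_id, unique(dta$school_id))

# Variant 2: mimic the loop, starting a new id at every change of school_id.
# `changed` (hypothetical name) compares each row with the one before it.
changed <- dta$school_id[-1] != dta$school_id[-nrow(dta)]
dta$sid_run <- cumsum(c(TRUE, changed))

print(dta)
```

If the numbering should follow the sorted school_id values rather than the order of appearance, as.integer(factor(dta$school_id)) is another option. None of these touch the rows one at a time, which is the part that makes the original loop slow on a large data frame.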