[R] Ranking within factor subgroups
maneesh deshpande
dmaneesh at hotmail.com
Wed Feb 22 03:44:47 CET 2006
Hi,
I have a data frame, x, of the following form:
Date      Symbol  A   B   C
20041201  ABC     10  12  15
20041201  DEF      9   5   4
...
20050101  ABC      5   3   1
20050101  GHM     12   4   2
....
Here A, B, and C are properties of a set of symbols recorded on a given date.
I want to decile the symbols for each date and property, and create another
set of columns, "bucketA", "bucketB", "bucketC", containing the decile rank
of each symbol. The following non-vectorized code does what I want:
bucket <- function(data, nBuckets) {
  # break points at nBuckets + 1 equally spaced quantiles
  q <- quantile(data, seq(0, 1, length.out = nBuckets + 1), na.rm = TRUE)
  q[1] <- q[1] - 0.1  # need to do this to ensure there are no extra NAs
  cut(data, q, include.lowest = TRUE, labels = FALSE)
}
calcDeciles <- function(x, colNames) {
  nBuckets <- 10
  dates <- unique(x$Date)
  for (date in dates) {
    iVec <- x$Date == date  # rows belonging to this date
    xx <- x[iVec, ]
    for (colName in colNames) {
      data <- xx[, colName]
      bColName <- paste("bucket", colName, sep = "")
      x[iVec, bColName] <- bucket(data, nBuckets)
    }
  }
  x
}
x <- calcDeciles(x, c("A", "B", "C"))
I was wondering if it is possible to vectorize the above function to make it
more efficient.
I tried
rlist <- tapply(x$A, x$Date, bucket, nBuckets = 10)
but I am not sure how to assign the contents of "rlist" back to the
appropriate rows of the original data frame.
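One possibility for the assignment step might be base R's ave(), which applies
a function within each group and returns the results in the original row
order, so they can be stored directly as a new column. The following is only a
sketch under assumptions: the helper name addBuckets and the random toy data
are made up for illustration, and bucket() is repeated so the example runs on
its own. It still loops over the column names, but no longer over the dates.

```r
# Sketch only: addBuckets and the toy data below are hypothetical.
bucket <- function(data, nBuckets) {
  q <- quantile(data, seq(0, 1, length.out = nBuckets + 1), na.rm = TRUE)
  q[1] <- q[1] - 0.1
  cut(data, q, include.lowest = TRUE, labels = FALSE)
}

addBuckets <- function(x, colNames, nBuckets = 10) {
  for (colName in colNames) {
    bColName <- paste("bucket", colName, sep = "")
    # ave() splits the column by Date, applies FUN to each piece,
    # and puts the results back in the original row order
    x[[bColName]] <- ave(x[[colName]], x$Date,
                         FUN = function(v) bucket(v, nBuckets))
  }
  x
}

# Toy data: two dates, 50 symbols per date, random property values
set.seed(1)
x <- data.frame(Date   = rep(c(20041201, 20050101), each = 50),
                Symbol = rep(paste("S", 1:50, sep = ""), times = 2),
                A = rnorm(100), B = rnorm(100), C = rnorm(100))
x <- addBuckets(x, c("A", "B", "C"))
head(x)
```

Note that cut() stops with an error if the quantile break points are not
unique, which can happen with heavily tied data or very few symbols on a date,
so this sketch assumes enough distinct values per date.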
Thanks,
Maneesh