[R] Improvement in Process time
Amelia Marsh
amelia_marsh08 at yahoo.com
Tue Feb 2 13:03:02 CET 2016
Dear R forum,
I am running a particular process 1000 times for different rates. Each time, the result of the process is stored (appended) in a data.frame. However, the process takes an unusually long time, at times more than 2 hours. When I tried to find the reason for such a long run time, I realized that writing to the data.frame consumes a lot of the time.
Here is an extract of my code:
# ---------------------------------------------------------------
library(plyr)  # provides daply() and ddply()

# simulated_interest, simulated_exchange and simulated_instruments are
# data frames created earlier (not part of this extract)
tx_discounted <- read.csv('transaction_discounted.csv', na.strings='')
tx_discounted$id <- as.character(tx_discounted$id)
n <- max(unique(simulated_exchange$id))
result <- NULL
# current*/rcount* track the start row and block size for each environment
current <- 1
rcount <- 0
current1 <- 1
rcount1 <- 0
current2 <- 1
rcount2 <- 0
for (env in 0:n) {
  # Slice out the rows belonging to this environment
  if (rcount == 0) rcount <- nrow(subset(simulated_interest, id==env))
  temp <- current+rcount-1
  env_rates <- simulated_interest[current:temp,]
  env_rates <- env_rates[order(env_rates$curve, env_rates$day_count), ]
  if (rcount1 == 0) rcount1 <- nrow(subset(simulated_exchange, id==env))
  temp <- current1+rcount1-1
  exch_rates <- simulated_exchange[current1:temp,]
  if (rcount2 == 0) rcount2 <- nrow(subset(simulated_instruments, id==env))
  temp <- current2+rcount2-1
  instr_rates <- simulated_instruments[current2:temp,]
  current <- current+rcount
  current1 <- current1+rcount1
  current2 <- current2+rcount2
  # One interpolation function per curve
  curve <- daply(env_rates, 'curve', function(x) {
    return(approxfun(x$day_count, x$rate, rule = 2))
  })
  # ____________________________________________________
  ## Actual time consumption begins in the following part
  # ____________________________________________________
  result <- rbind(result, ddply(tx_discounted, 'id', function(x) {
    if(!is.na(x$curve) && x$curve != '') {
      intrate <- curve[[x$curve]](x$maturity_period)
    } else {
      intrate <- subset(instr_rates, instrument==as.character(x$instrument))$value
    }
    cross_rate <- subset(exch_rates, key==paste(x$currency, x$currency_base, sep='_'))$rate
    mtm_bc <- cross_rate * (x$amount/(1+((intrate/100)*(x$maturity_period/x$intbasis))))
    return(data.frame(env=env, id=x$id, instrument=x$instrument, currency=x$currency,
                      intrate = intrate, maturity_period = x$maturity_period, intbasis = x$intbasis,
                      cross_rate = cross_rate, amount=x$amount, mtm_bc=mtm_bc))
  }))
}
# ---------------------------------------------------------------------------
Unfortunately I can't share the input files. Is there any way I can improve the process time?
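Since the input files cannot be shared, the following minimal sketch reproduces only the append pattern with made-up data. The helper make_chunk() and the functions grow() and collect() are hypothetical names used for illustration, not part of the real code. grow() mirrors the rbind() inside the loop above; collect() is one commonly suggested alternative that stores each piece in a pre-allocated list and binds once at the end. The timings are printed rather than stated here, since they depend on the machine and on n.

```r
# Toy stand-in for the per-environment result (made-up data, not the real inputs)
make_chunk <- function(env) {
  data.frame(env = env, id = as.character(1:5), mtm_bc = env * (1:5),
             stringsAsFactors = FALSE)
}

n <- 200

# Pattern from the extract: grow the data.frame with rbind() on every pass
grow <- function(n) {
  result <- NULL
  for (env in 0:n) result <- rbind(result, make_chunk(env))
  result
}

# Alternative: keep each chunk in a pre-allocated list, bind once at the end
collect <- function(n) {
  chunks <- vector("list", n + 1)
  for (env in 0:n) chunks[[env + 1]] <- make_chunk(env)
  do.call(rbind, chunks)
}

res_grow <- grow(n)
res_collect <- collect(n)

# Compare the elapsed time of the two approaches on this toy data
print(system.time(grow(n)))
print(system.time(collect(n)))
```

Both functions return the same rows in the same order, so the second form could in principle replace the first without changing the result; whether it removes the bottleneck in the real process would still need to be measured.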
Thanks in advance and regards,
Amelia