[R] How to speed up interpolation
James Rome
jamesrome at gmail.com
Sun Jul 17 19:30:21 CEST 2011
df is a very large data frame with arrival estimates for many flights
(df$flightfact) at random times (df$PredTime). The error of the estimate
is df$dt.
My problem is that I want to know the prediction error at each minute
before landing. The code below works, but it is very slow and dominates
the total run time. I tried using split(), but that rapidly ate up my
12 GB of memory. Is there a better way to do this in R?
Thanks,
Jim Rome
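# interpolateTimes() (defined below) must be sourced before this loop runs.
flights = table(droplevels(df$flightfact))   # rows per flight, unused levels dropped
nflights = length(flights)
flights = as.data.frame(flights)             # columns: Var1 (flight), Freq (row count)
times = data.frame()
# Split by flight
for (i in seq_len(nflights)) {
    # Rows for this flight, matched on the flight label rather than the factor code
    tf = df[as.character(df$flightfact) == as.character(flights[i, 1]), ]
    # Check for at least 2 entries, since approx() needs two points
    if (nrow(tf) < 2) {
        next
    }
    idf = interpolateTimes(tf)
    times = rbind(times, idf)   # grows on every pass, copying everything so far
}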
# Interpolate the times to every minute for 60 minutes
# Return a new data frame
interpolateTimes = function(df) {
    x = as.numeric(seq(from=0, to=60))   # The times to interpolate to
    # rule=1 gives NA for times outside the observed range of PredTime
    dti = approx(as.numeric(df$PredTime), as.numeric(df$dt), x,
                 method="linear", rule=1)
    # Make a new data frame of interpolated values
    idf = data.frame(time=dti$x, error=dti$y,
                     runway=rep(df$lrw[1], length(dti$x)),
                     flight=rep(df$flightfact[1], length(dti$x)))
    return(idf)
}
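
For comparison, here is a rough sketch of one alternative, not benchmarked on the real data. It assumes the two main costs are subsetting the whole data frame once per flight and growing `times` with rbind() on every pass. The idea is to split only the row indices by flight (so the data frame itself is not copied per flight) and to bind the pieces once at the end. The toy data and the names interpolateRows, rowsByFlight and pieces are made up for illustration; the column names follow the description above.

```r
# Toy stand-in for the real df (all values are invented for illustration).
df <- data.frame(
  flightfact = factor(rep(c("AA1", "BA2", "CX3"), times = c(5, 4, 1))),
  PredTime   = c(0, 15, 30, 45, 60,  5, 20, 40, 55,  10),
  dt         = c(4, 3, 2, 1, 0,      6, 4, 2, 1,      3),
  lrw        = rep(c("09L", "27R", "09L"), times = c(5, 4, 1))
)

# Interpolate one flight, given only its row numbers in df.
# (Hypothetical helper; mirrors interpolateTimes() but avoids subsetting df.)
interpolateRows <- function(rows, df, x = 0:60) {
  if (length(rows) < 2) return(NULL)   # approx() needs at least two points
  y <- approx(as.numeric(df$PredTime[rows]), as.numeric(df$dt[rows]),
              xout = x, method = "linear", rule = 1)$y
  data.frame(time = x, error = y,
             runway = df$lrw[rows[1]],
             flight = df$flightfact[rows[1]])
}

# Split the row indices, not the data frame, by flight.
rowsByFlight <- split(seq_len(nrow(df)), df$flightfact, drop = TRUE)

# Build every piece first, drop skipped flights, then bind once.
pieces <- lapply(rowsByFlight, interpolateRows, df = df)
pieces <- Filter(Negate(is.null), pieces)
times  <- do.call(rbind, unname(pieces))
```

If memory is still a problem with millions of flights, the same per-flight function could be applied in chunks of flights, binding each chunk once. That is an untested suggestion rather than something measured here.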