[R] help for efficient loop
Takatsugu Kobayashi
tkobayas at indiana.edu
Thu Mar 8 05:22:07 CET 2007
Hi,
I have been trying to reduce the computation time of the loops below.
I have successfully used lapply to speed up much simpler code, so I am
trying to use lapply or sapply to do the same here.
The purpose is to generate X and Y coordinates from a normal
distribution and compute local standard distances. By "local" I mean
that, for each reference point, the observation points within a given
distance threshold of it are selected.
# Normally distributed X-Y Coordinates with hypothetical z values
pts<-500 # Number of observations =n
cases<-10 # Number of variables
x<-rnorm(pts)
y<-rnorm(pts)
z<-matrix(abs(rnorm(pts*cases)),pts,cases)
# Combine x, y, and zs
Ldata<-cbind(x,y,z) # n*(2+p) matrix p=# of variables 2=X and Y
# Compute the Euclidean distances between points
disE<-data.matrix(dist(cbind(x,y)))
# Create a series of values that act as a threshold
thrsE<-seq(1,max(disE),by=0.5)
# Compute local mean centers and median centers of the nearest neighbors
# within the distance threshold of n reference points
LMNX<-matrix(,pts,length(thrsE)) # local mean X
LMNY<-matrix(,pts,length(thrsE)) # local mean Y
LMDX<-matrix(,pts,length(thrsE)) # local median X
LMDY<-matrix(,pts,length(thrsE)) # local median Y
LSDMN<-rep(list(matrix(,pts,length(thrsE))),cases)
# Then compute standard distances of the Zs of the neighbors within the
# distance thresholds of n reference points
for (j in 1:pts){
  for (k in 1:length(thrsE)){
    # Neighbors of reference point j within threshold k
    idx<-which(disE[j,]<=thrsE[k])
    LMNX[j,k]<-mean(Ldata[idx,1])
    LMNY[j,k]<-mean(Ldata[idx,2])
    LMDX[j,k]<-median(Ldata[idx,1])
    LMDY[j,k]<-median(Ldata[idx,2])
    for (l in 1:cases){
      # Standard distance weighted by the l-th z variable
      w<-Ldata[idx,2+l]
      LSDMN[[l]][j,k]<-sqrt(sum(w*(Ldata[idx,1]-LMNX[j,k])^2+
        w*(Ldata[idx,2]-LMNY[j,k])^2)/sum(w))
    }
  }
}
I believe lapply or sapply could reduce the computing time here, because
my current approach assigns each computed value to element [j,k] of a
large matrix, one at a time. I have tried lapply, but I am not sure how
to set up higher-dimensional arrays that work with it.
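One possible restructuring, given only as a sketch and not as a tested replacement for the loops above: instead of filling [j,k] element by element, build a 0/1 neighbor matrix for each threshold and get the local means and weighted standard distances for all reference points at once with matrix products, using lapply over the thresholds. The helper name local_stats and the set.seed call are assumptions added for illustration. The weighted sum of squared deviations is expanded algebraically, so results can differ from the loop version by floating-point rounding. Medians have no matrix-product form, so they still go through apply.

```r
# Sketch only: same quantities as the nested loops, computed per threshold.
set.seed(1)                          # assumption: added for reproducibility
pts   <- 500                         # number of observations
cases <- 10                          # number of z variables
x <- rnorm(pts)
y <- rnorm(pts)
z <- matrix(abs(rnorm(pts * cases)), pts, cases)
disE  <- as.matrix(dist(cbind(x, y)))
thrsE <- seq(1, max(disE), by = 0.5)

# local_stats() is a hypothetical helper, not a function from any package.
local_stats <- function(thr) {
  W  <- (disE <= thr) * 1            # pts x pts 0/1 neighbor indicator
  n  <- rowSums(W)                   # neighbor counts (>= 1: each point is its own neighbor)
  mx <- as.vector(W %*% x) / n       # local mean X for every reference point
  my <- as.vector(W %*% y) / n       # local mean Y
  Wz <- W %*% z                      # pts x cases: sum of weights per neighborhood
  # sum_i w_i * ((x_i - mx)^2 + (y_i - my)^2), expanded into matrix products
  num <- W %*% (z * (x^2 + y^2)) -
    2 * mx * (W %*% (z * x)) -
    2 * my * (W %*% (z * y)) +
    (mx^2 + my^2) * Wz
  # pmax() guards against tiny negative values caused by rounding
  list(mx = mx, my = my, sd = sqrt(pmax(num, 0) / Wz))
}

res <- lapply(thrsE, local_stats)

# Reassemble into the same shapes as in the loop version
LMNX  <- sapply(res, function(r) r$mx)             # pts x length(thrsE)
LMNY  <- sapply(res, function(r) r$my)
LSDMN <- lapply(seq_len(cases),
                function(l) sapply(res, function(r) r$sd[, l]))

# Medians: one apply() over the rows of the logical neighbor matrix per threshold
LMDX <- sapply(thrsE, function(thr)
  apply(disE <= thr, 1, function(nb) median(x[nb])))
LMDY <- sapply(thrsE, function(thr)
  apply(disE <= thr, 1, function(nb) median(y[nb])))
```

Each element of res holds the results for one threshold, so no higher-dimensional array has to be defined up front; the matrices are assembled afterwards with sapply. Whether this is actually faster on a given machine would need to be checked with system.time.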
Many thanks in advance.
Taka
Indiana University