[R] How to do it without for loops?

Thomas Lumley tlumley at u.washington.edu
Wed Mar 8 18:50:27 CET 2006


On Wed, 8 Mar 2006, Gabor Grothendieck wrote:

> Actually one problem with both of the solutions is determining
> that they are correct.  From that viewpoint, although not as
> elegant or fun, a more straightforward
> translation of the original question might actually be preferable

Yes. Although for the application where I encounter this (sandwich 
variance estimators), the proof that the two versions are the same is of 
independent interest, since it shows the relationship between the cluster 
jackknife, delta-betas, and the sandwich estimator.

 	-thomas



> (Thomas also already mentioned the computation in f):
>
> f <- function(x) crossprod(crossprod(x[[3]], as.matrix(x[,1:2])))
> mm <-by(dat, dat$id, f)
> mm[[1]] + mm[[2]] + mm[[3]]
>
> or the last line could be replaced with these two if you don't know
> that there are necessarily three components:
>
> result <- mm[[1]]
> result[] <- do.call("mapply", c(sum, mm))
>
> One thing that this bought out for me is that R does not have
> a parallel to pmax for sum.  If one had psum of if "+" allowed
> more than 2 arguments one could have done this instead of the
> two lines involving result above:
>
>  do.call("psum", mm)  # if psum analog to pmax were available
>  do.call("+", mm)  # if + allowed > 2 arguments
>
>
> On 3/8/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
>> On Wed, 8 Mar 2006, Gabor Grothendieck wrote:
>>
>>> Thomas' solution is better but thought this might be of interest
>>> anyways since it can be written closer to mathematical notation.
>>> That is, the required expression can be written in the
>>> following equivalent way for a suitable matrix A:
>>>
>>> X' diag(u) A' A diag(u) X
>>
>> Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine
>> when n=20, or even 200,  but still...
>>
>>        -thomas
>>
>>
>>
>>> where diag(u) is a diagonal matrix with u along the diagonal
>>> as in the R diag function, spaces refer to matrix multiplication
>>> and ' means transpose.
>>>
>>> Thus we have:
>>>
>>> A <- outer(unique(dat$id), dat$id, "==")
>>> crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2]))
>>>
>>>
>>> On 3/8/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
>>>> On Wed, 8 Mar 2006, ronggui wrote:
>>>>
>>>>> Thank you for all .
>>>>>
>>>>> One more question.How can I calculate these efficiently?
>>>>>
>>>>> set.seed(100)
>>>>> dat<-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
>>>>> # In this example,id's elements are  0,1,2.
>>>>> y<-list()
>>>>> for (i in 0:2){
>>>>> X<-as.matrix(subset(dat,id==i,c("x1","x2")))
>>>>> u<-as.matrix(subset(dat,id==i,c("u")))
>>>>> y[[i+1]]<-t(X)%*%u%*%t(u)%*%X
>>>>> }
>>>>> y[[1]]+y[[2]]+y[[3]]
>>>>>
>>>>
>>>> People have already told you about crossprod, so crossprod(crossprod(X,u))
>>>> would seem an obvious improvement over the matrix multiplications.
>>>>
>>>> There is a better solution, though.
>>>>
>>>> Xu<-dat[,c("x1","x2")]*dat[,"u"]
>>>> crossprod( rowsum(Xu, dat$id))
>>>>
>>>>        -thomas
>>>>
>>>>
>>>>> the above code is not elegant.And my second solution to this problem
>>>>> is using by to get a list.
>>>>>
>>>>> matlis<-by(dat, dat$id, function(x){
>>>>> a<-as.matrix(x[,c("x1","x2")])
>>>>> b<-as.matrix(x[, "u"])
>>>>> t(a) %*% b  %*% t(b) %*% a
>>>>> })
>>>>>
>>>>> S <- matrix(unlist(matlis), 4, length(matlis))
>>>>> S1 <- matrix(rowSums(S), 2, 2)
>>>>>
>>>>> The code works ,but I want to ask if there is any other more better
>>>>> ways to do it? It seems that this kind of computation is quite common.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2006/2/28, Gabor Grothendieck <ggrothendieck at gmail.com>:
>>>>>> Try:
>>>>>>
>>>>>> crossprod(x)
>>>>>>
>>>>>> or
>>>>>>
>>>>>> t(x) %*% x
>>>>>>
>>>>>> On 2/28/06, ronggui <ronggui.huang at gmail.com> wrote:
>>>>>>> This is the code:
>>>>>>>
>>>>>>> x<-matrix(rnorm(20),5)
>>>>>>> y<-list()
>>>>>>> for (i in seq(nrow(x))) y[[i]]<-t(x[i,,drop=F])%*%x[i,,drop=F]
>>>>>>> y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]
>>>>>>>
>>>>>>> How can I do it without using for loops?
>>>>>>> Thank you in advance!
>>>>>>> --
>>>>>>> ronggui
>>>>>>> Deparment of Sociology
>>>>>>> Fudan University
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help at stat.math.ethz.ch mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> »ÆÈÙ¹ó
>>>>> Deparment of Sociology
>>>>> Fudan University
>>>>>
>>>>>
>>>>
>>>> Thomas Lumley                   Assoc. Professor, Biostatistics
>>>> tlumley at u.washington.edu        University of Washington, Seattle
>>>>
>>>> ______________________________________________
>>>> R-help at stat.math.ethz.ch mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>>>
>>>>
>>>
>>> ______________________________________________
>>> R-help at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>>
>>
>> Thomas Lumley                   Assoc. Professor, Biostatistics
>> tlumley at u.washington.edu        University of Washington, Seattle
>>
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


More information about the R-help mailing list