[R] How can I avoid the for and If loops in my function?
Richard M. Heiberger
rmh at temple.edu
Thu Jun 19 17:58:17 CEST 2014
Now that it is simplified, we see that you are incrementing the size
of an array at each step. This is
a very inefficient practice in R. For comparison,
## what you are doing
system.time({
tmp <- NULL
for (i in 1:10000) tmp <- rbind(tmp, 1:2)
})
## user system elapsed
## 0.622 0.297 1.256
## much faster
system.time({
tmp <- matrix(nrow=10000, ncol=2)
for (i in 1:10000) tmp[i,] <- 1:2
})
## user system elapsed
## 0.027 0.002 0.030
On Thu, Jun 19, 2014 at 11:04 AM, Mramba,Lazarus K <lmramba at ufl.edu> wrote:
> Hi Martin,
>
> Thanks for the useful comments. It has greatly improved the efficiency of my
> functions. I am sorry that I forgot to paste the functions varG and others
> which I have done here for reproducibility.
>
>
> #############################
> ## a function to calculate h2
> ## (heritability)
> ###############################
> herit<-function(varG,varR=1)
> {
> h<-4*varG/(varG+varR)
> h
> }
> ###################################
> # function to calculate random error
> varR<-function(varG,h2)
> {
> varR<- varG*(4-h2)/h2
> varR
> }
>
> ##########################################
> # function to calculate treatment variance
> varG<-function(varR=1,h2)
> {
> varG<-varR*h2/(4-h2)
> varG
> }
>
> ###############################
>
> I used the package "gmp" so that I can use the function "is.whole()".
>
> I have managed to reduce the complexity of the function that I needed help.
> It now takes half the time it used to take before, but it is several hours
> and wish I can cut these hours down by doing away with the for and if loops
> in this code.
> Any help will be appreciated.
>
> newfunF<- function(matrix0,n,traceI)
> {
> newmatdf<-matrix0
> trace<-traceI
> mat <- NULL
> Design_best <- newmatdf
> mat <- rbind(mat, c(value = trace, iterations = 0))
> Des<-list()
> newmatdf<-swapsimple(matrix0)
> for(i in 2:n){
> newmatdf<-swapsimple(newmatdf)
>
> Des[[i]]<-newmatdf
> if(swapmainF(newmatdf) < trace){
> Design_best <- Des[[i]]<-newmatdf
> trace<- swapmainF(newmatdf)
> mat <- rbind(mat, c(trace = trace, iterations = i))
> }
> }
> list(mat=mat,Design_best=Design_best)
>
> }
>
>
>
>
>
> On Thu, 19 Jun 2014 16:50:15 +0200, Martin Maechler wrote:
>>>>>>>
>>>>>>> lmramba <lmramba at ufl.edu>
>>>>>>> on Wed, 18 Jun 2014 20:00:15 +0000 writes:
>>
>>
>> > Hi Jim. If I avoid the dataframe, how can use the function
>> model.matrix() to
>> > build the incident matrices X, And Z? I tried saving the design
>> as matrix
>> > but ghen I got the wrong design matrix.
>>
>> I think you are entirely right here, Laz.
>> That indeed you have data frame and a formula --> model.matrix()
>> to get the matrix.
>>
>> I have no time currently to delve into your
>> example, and I see
>> - it is not reproducible {you use a function
>> varG() that is undefined}
>> - you use foreach just so you can use %do% in one place which I
>> think makes no sense
>> - you use package 'gmp' which I don't think you'd use, but I
>> don't know as your code is not reproducible ....
>> - you use "<<-" in quite a few places in your code, which is
>> considered really bad programming style and makes it very hard
>> to understand the code by reading it ...
>>
>> ... *but* .. after all that ...
>> ...
>> as maintainer of the Matrix package
>> I'm close to absolutely sure that you want to work with *sparse*
>> matrices as Matrix provides.
>>
>> So in fact, do use
>>
>> ## "require": just so you are not tempted to call a package a "library"
>> require(Matrix)
>>
>> help(sparse.model.matrix)
>>
>> and then do
>> - use sparse.model.matrix() instead of model.matrix().
>>
>> Further, do
>> - use Diagonal() instead of diag() for *constructing* diagonal matrices.
>>
>> Please let us know if this helps
>>
>> [and maybe fix your example to become reproducible: do
>>
>> rm(list=ls(all=TRUE))
>>
>> before source(...) ing the reproducible example script...
>> ]
>>
>> Martin Maechler, ETH Zurich
>>
>>
>> > Thanks.
>>
>> > Laz
>>
>>
>> > Sent from my LG Optimus G™, an AT&T 4G LTE smartphone
>> > ------ Original message ------
>> > From: jim holtman
>> > Date: 6/18/2014 3:49 PM
>> > To: Laz;
>> > Cc: R mailing list;
>> > Subject:Re: [R] How can I avoid the for and If loops in my function?
>>
>> > First order of business, without looking in detail at the code,
>> is to avoid
>> > the use of dataframes. If all your values are numerics, then
>> use a matrix.
>> > It will be faster execution.
>> > I did see the following statements:
>> > newmatdf<-Des[[i]]
>> > Des[[i]]<-newmatdf
>> > why are you just putting back what you pulled out of the list?
>>
>> > Jim Holtman
>> > Data Munger Guru
>>
>> > What is the problem that you are trying to solve?
>> > Tell me what you want to do, not how you want to do it.
>> > On Wed, Jun 18, 2014 at 12:41 PM, Laz <[1]lmramba at ufl.edu> wrote:
>>
>> > Dear R-users,
>> > I have a 3200 by 3200 matrix that was build from a data frame that
>> had
>> > 180 observations, with variables: x, y, blocks (6 blocks) and
>> > treatments (values range from 1 to 180) I am working on. I build
>> other
>> > functions that seem to work well. However, I have one function that
>> has
>> > many If loops and a long For loop that delays my results for over 10
>> > hours ! I need your help to avoid these loops.
>> > ########################################################
>> > ## I need to avoid these for loops and if loops here :
>> > ########################################################
>> > ### swapsimple() is a function that takes in a dataframe,
>> randomly swaps
>> > two elements from the same block in a data frame and generates a new
>> > dataframe called newmatdf
>> > ### swapmainF() is a function that calculates the trace of the final
>> N
>> > by N matrix considering the incident matrices and blocks and
>> treatments
>> > and residual errors in a linear mixed model framework using
>> Henderson
>> > approach.
>> > funF<- function(newmatdf, n, traceI)
>> > {
>> > # n = number of iterations (swaps to be made on pairs of
>> elements of the
>> > dataframe, called newmatdf)
>> > # newmatdf : is the original dataframe with N rows, and 4 variables
>> > (x,y,blocks,genotypes)
>> > matrix0<-newmatdf
>> > trace<-traceI ## sum of the diagonal elements of the N by N matrix
>> > (generated outside this loop) from the original newmatdf dataframe
>> > res <- list(mat = NULL, Design_best = newmatdf, Original_design =
>> > matrix0) # store our output of interest
>> > res$mat <- rbind(res$mat, c(value = trace, iterations = 0)) #
>> > initialized values
>> > Des<-list()
>> > for(i in seq_len(n)){
>> > ifelse(i==1,
>> > newmatdf<-swapsimple(matrix0),newmatdf<-swapsimple(newmatdf))
>> > Des[[i]]<-newmatdf
>> > if(swapmainF(newmatdf) < trace){
>> > newmatdf<-Des[[i]]
>> > Des[[i]]<-newmatdf
>> > trace<- swapmainF(newmatdf)
>> > res$mat <- rbind(res$mat, c(trace = trace, iterations = i))
>> > res$Design_best <- newmatdf
>> > }
>> > if(swapmainF(newmatdf) > trace & nrow(res$mat)<=1){
>> > newmatdf<-matrix0
>> > Des[[i]]<-matrix0
>> > res$Design_best<-matrix0
>> > }
>> > if(swapmainF(newmatdf)> trace & nrow(res$mat)>1){
>> > newmatdf<-Des[[length(Des)-1]]
>> > Des[[i]]<-newmatdf
>> > res$Design_best<-newmatdf
>> > }
>> > }
>> > res
>> > }
>> > The above function was created to:
>> > Take an original matrix, called matrix0, calculate its trace.
>> > Generate a new matrix, called newmatdf after swapping two
>> elements of the
>> > old one and calculate the trace. If the trace of the newmatrix is
>> > smaller than
>> > that of the previous matrix, store both the current trace together
>> > with the older trace and their iteration values. If the newer
>> matrix has
>> > a trace larger than the previous trace, drop this trace and drop
>> this
>> > matrix too (but count its iteration).
>> > Re-swap the old matrix that you stored previously and recalculate
>> the
>> > trace. Repeat the
>> > process many times, say 10,000. The final results should be a list
>> > with the original initial matrix and its trace, the final best
>> > matrix that had the smallest trace after the 10000 simulations and a
>> > dataframe showing the values of the accepted traces that
>> > were smaller than the previous and their respective iterations.
>> > $Original_design
>> > x y block genotypes
>> > 1 1 1 1 29
>> > 7 1 2 1 2
>> > 13 1 3 1 8
>> > 19 1 4 1 10
>> > 25 1 5 1 9
>> > 31 1 6 2 29
>> > 37 1 7 2 4
>> > 43 1 8 2 22
>> > 49 1 9 2 3
>> > 55 1 10 2 26
>> > 61 1 11 3 18
>> > 67 1 12 3 19
>> > 73 1 13 3 28
>> > 79 1 14 3 10
>> > ------truncated ----
>> > the final results after running funF<-
>> > function(newmatdf,n,traceI) given below looks like this:
>> > ans1
>> > $mat
>> > value iterations
>> > [1,] 1.474952 0
>> > [2,] 1.474748 1
>> > [3,] 1.474590 2
>> > [4,] 1.474473 3
>> > [5,] 1.474411 5
>> > [6,] 1.474294 10
>> > [7,] 1.474182 16
>> > [8,] 1.474058 17
>> > [9,] 1.473998 19
>> > [10,] 1.473993 22
>> > ---truncated
>> > $Design_best
>> > x y block genotypes
>> > 1 1 1 1 29
>> > 7 1 2 1 2
>> > 13 1 3 1 18
>> > 19 1 4 1 10
>> > 25 1 5 1 9
>> > 31 1 6 2 29
>> > 37 1 7 2 21
>> > 43 1 8 2 6
>> > 49 1 9 2 3
>> > 55 1 10 2 26
>> > ---- truncated
>> > $Original_design
>> > x y block genotypes
>> > 1 1 1 1 29
>> > 7 1 2 1 2
>> > 13 1 3 1 8
>> > 19 1 4 1 10
>> > 25 1 5 1 9
>> > 31 1 6 2 29
>> > 37 1 7 2 4
>> > 43 1 8 2 22
>> > 49 1 9 2 3
>> > 55 1 10 2 26
>> > 61 1 11 3 18
>> > 67 1 12 3 19
>> > 73 1 13 3 28
>> > 79 1 14 3 10
>> > ------truncated
>> > Regards,
>> > Laz
>> > [[alternative HTML version deleted]]
>> > ______________________________________________
>> > [2]R-help at r-project.org mailing list
>> > [3]https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting
>> > guide [4]http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> > References
>>
>> > 1. mailto:lmramba at ufl.edu
>> > 2. mailto:R-help at r-project.org
>> > 3. https://stat.ethz.ch/mailman/listinfo/r-help
>> > 4. http://www.R-project.org/posting-guide.html
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list