[R] Regression performance when using summary() twice

c@buhtz m@iii@g oii posteo@jp
Fri Jun 21 16:38:58 CEST 2024


Hello,

I am not a regular R user; I come from Python. But I use R for
several specialized tasks.

A regression analysis costs some compute time. I wonder at which
point this time-consuming algorithm is actually executed, and whether
it is executed twice in my particular case.

It seems that calling "glm()" or a similar function does not execute
the time-consuming part of the regression code. Instead, it seems to
be executed when calling "summary(model)".
Am I right so far?
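
A rough way to check where the time goes would be to time each step
separately with system.time(). This is only a minimal sketch: it
assumes the MASS package is installed and uses its bundled "quine"
data set and a made-up formula as stand-ins for my real data.

```r
library(MASS)  # provides glm.nb() and the quine example data

# Assumption: quine and this formula are placeholders, not my real case.
t_fit     <- system.time(model <- glm.nb(Days ~ Sex + Age, data = quine))
t_summary <- system.time(s <- summary(model))
t_confint <- system.time(ci <- confint(model))

# Elapsed wall-clock seconds for each step
timings <- c(fit     = t_fit[["elapsed"]],
             summary = t_summary[["elapsed"]],
             confint = t_confint[["elapsed"]])
print(timings)
```

On a small data set all three numbers may be close to zero, so the
comparison is more informative on data of realistic size.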

If this is correct, then in my case the regression is done twice with
an identical formula and data, which of course is inefficient. See
this code:

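library(MASS)  # provides glm.nb()

my_function <- function(formula_string, data) {
    formula <- as.formula(formula_string)
    model <- glm.nb(formula, data = data)

    # First call to summary(): coefficient table plus confidence intervals
    result <- cbind(summary(model)$coefficients, confint(model))
    result <- as.data.frame(result)

    # Second call to summary(): printed output captured as text
    string_result <- capture.output(summary(model))

    return(list(result, string_result))
}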

I call summary() once to get the "$coefficients" and a second time
when capturing its output as a string.

If this really results in computing the regression twice, is there an
R way to make this more efficient?
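
For illustration, here is a minimal sketch of what I imagine such a
version could look like: summary() is called only once and the stored
object is reused for both the coefficient table and the captured
text. The name "model_summary" and the example call on the MASS
"quine" data are my own placeholders, and I have not measured whether
this saves a meaningful amount of time.

```r
library(MASS)  # provides glm.nb() and the quine example data

my_function <- function(formula_string, data) {
    formula <- as.formula(formula_string)
    model <- glm.nb(formula, data = data)

    # Compute the summary once and reuse it below
    model_summary <- summary(model)

    result <- as.data.frame(
        cbind(model_summary$coefficients, confint(model))
    )

    # Printing the stored summary object yields the same text output
    string_result <- capture.output(print(model_summary))

    list(result, string_result)
}

# Hypothetical usage with placeholder formula and data
out <- my_function("Days ~ Sex + Age", quine)
```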

Best regards,
Christian Buhtz


