[R] glm() scale parameters and predicted Values
Ben Bolker
bbolker at gmail.com
Thu Jul 14 16:52:34 CEST 2011
Peter Maclean <pmaclean2011 <at> yahoo.com> writes:
>
> In glm() you can use the summary() function to recover
> the shape parameter (the reciprocal of the
> dispersion parameter). How do you recover the scale parameter?
> Also, in the given example, how I estimate
> and save the geometric mean of the predicted values?
> For a simple model you can use fitted() or predicted()
> functions. I will appreciate any help.
>
>
>
> #Call required R packages
> require(plyr)
> require(stats)
> require(fitdistrplus)
> require(MASS)
> #Grouped vector
> n <- c(1:10)
> yr <-c(1:10)
> ny <- list(yr=yr,n=n)
> require(utils)
> ny <- expand.grid(ny)
> y = rgamma(100, shape=1.5, rate = 1, scale = 2)
It's a bit odd to specify both the rate and scale parameters,
which are redundant. I would have guessed that the
rate parameter would dominate, but it looks like the
scale parameter dominates:
> set.seed(1001); rgamma(1,shape=1,rate=2)
[1] 1.622577
> set.seed(1001); rgamma(1,shape=1,scale=2)
[1] 6.490306
> set.seed(1001); rgamma(1,shape=1,rate=2,scale=2)
[1] 6.490306
> set.seed(1001); rgamma(1,shape=1,scale=2,rate=2)
[1] 6.490306
(I know, I could go look at the source, but it's
a .Internal() function and I'm feeling lazy ...)
> Gdata <- cbind(ny,y)
> Gdata2<- Gdata
> Gdata$x1 <- cos((3.14*yr)/365.25)
> Gdata$x2 <- sin((3.14*yr)/365.25)
> #Fitting Generalized Linear Models
> Gdata <- split(Gdata,Gdata$n)
> FGLM <- lapply(Gdata, function(x){
> m <- as.numeric(x$y)
> x1 <- m <- as.numeric(x$x1)
> x2 <- m <- as.numeric(x$x2)
> summary(glm(m~1+x1+x2, family=Gamma),dispersion=NULL)
> })
>
Note that you have simulated a null model (the data don't
depend on the covariates).
> #Save the results of the estimated parameters
> str(FGLM,no.list = TRUE)
> SFGLMC<- ldply(FGLM, function(x) x$coefficients)
> SFGLMD<- ldply(FGLM, function(x) x$dispersion)
> GLM <- cbind(SFGLMC,SFGLMD)
Which scale parameter do you mean? In a real
model it will vary with x1 and x2. Let's try an
experiment first:
> set.seed(1001)
> z <- rgamma(10000,shape=2,scale=3)
> g1 <- glm(z~1,family=Gamma)
> summary(g1)
Call:
glm(formula = z ~ 1, family = Gamma)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.8434 -0.6579 -0.1702 0.3081 2.6220
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.167013 0.001189 140.5 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Gamma family taken to be 0.5066566)
Null deviance: 5419.8 on 9999 degrees of freedom
Residual deviance: 5419.8 on 9999 degrees of freedom
AIC: 53526
Here the intercept estimate is 0.167 , which is very
nearly 1/6 = 1/(shape*scale) -- i.e. the Gamma GLM is
parameterized in terms of the mean (on the inverse
scale). If you want to recover
the scale parameter for the intercept case, then
summary(g1)$dispersion/coef(g1)[1]
should be pretty good.
As for the geometric means -- that's pretty tricky.
*If* you used a log link, then the predicted values on
the link scale (i.e. predict(g1,type="link")) would indeed
give you the geometric means. On the inverse scale, though,
you would have to do a bit of finagling.
More information about the R-help
mailing list