[R] Question regarding the discrepancy between count model parameter estimates between "pscl" and "MASS"
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Aug 29 20:36:52 CEST 2014
On 29/08/2014 17:46, Nick Livingston wrote:
> Thank you for your responses.
>
> Since my previous attempt to manually truncate my DV didn't work, I'm very interested in trying again using the zerotrun() function. However, I attempted to install "countreg" but received the following notification:
>
> Warning in install.packages :
> unable to access index for repository http://R-Forge.R-project.org/bin/macosx/contrib/3.0
>
> package ‘countreg’ is available as a source package but not as a binary
>
> Warning in install.packages :
> package ‘countreg’ is not available (for R version 3.0.3)
>
> I received the same message when attempting to install it in version 3.1.0, and the latest version, 3.1.1. Am I missing something?
As the package does not contain compiled code, you can simply install
from the sources. (I just did so in the GUI: just untick 'binary'.)
> Thank you again. I appreciate your input.
>
> -Nick
> --------------------------------------------
> On Fri, 8/29/14, Achim Zeileis <Achim.Zeileis at uibk.ac.at> wrote:
>
> Subject: Re: [R] Question regarding the discrepancy between count model parameter estimates between "pscl" and "MASS"
> To: "peter dalgaard" <pdalgd at gmail.com>
> Cc: "Nick Livingston" <nlivingston at ymail.com>, r-help at r-project.org
> Date: Friday, August 29, 2014, 5:26 AM
>
> On Fri, 29 Aug 2014,
> peter dalgaard wrote:
>
> >
> I'm no expert on hurdle models, but it seems that you
> are unaware that
> > the negative binomial
> and the truncated negative binomial are quite
> > different things.
>
> Yes. You can replicate the truncated count part
> of the hurdle model with
> the zerotrunc()
> function from the "countreg" package. The package
> is not
> yet on CRAN but can be easily
> installed from R-Forge.
>
> > -pd
> >
> >
> > On 29 Aug 2014, at
> 05:57 , Nick Livingston <nlivingston at ymail.com>
> wrote:
> >
> >> I have
> sought consultation online and in person, to no avail. I
> hope someone
> >> on here might have
> some insight. Any feedback would be most welcome.
> >>
> >> I am
> attempting to plot predicted values from a two-component
> hurdle model
> >> (logistic [suicide
> attempt yes/no] and negative binomial count [number of
> >> attempts thereafter]). To do so, I
> estimated each component separately using
> >> glm (MASS). While I am able to
> reproduce hurdle results for the logit
> >> portion in glm, estimates for the
> negative binomial count component are
> >> different.
> >>
> >> Call:
> >>
> hurdle(formula = Suicide. ~ Age + gender + Victimization *
> FamilySupport |
> >> Age + gender +
> Victimization * FamilySupport, dist = "negbin",
> link =
> >> "logit")
> >>
> >> Pearson
> residuals:
> >> Min
> 1Q Median 3Q Max
> >> -0.9816 -0.5187 -0.4094 0.2974
> 5.8820
> >>
> >>
> Count model coefficients (truncated negbin with log
> link):
> >>
>
> Estimate Std. Error z value
> >> Pr(>|z|)
> >>
> (Intercept) -0.29150
> 0.33127 -0.880 0.3789
> >> Age
> 0.17068
> 0.07556 2.259 0.0239
> >> *
> >> gender
>
> 0.28273
> 0.31614 0.894 0.3712
> >> Victimization
> 1.08405
> 0.18157 5.971 2.36e-09
> >>
> ***
> >> FamilySupport
> 0.33629
> 0.29302 1.148 0.2511
> >> Victimization:FamilySupport -0.96831
> 0.46841 -2.067 0.0387 *
> >> Log(theta)
> 0.12245
> 0.54102 0.226 0.8209
> >> Zero hurdle model coefficients
> (binomial with logit link):
> >>
>
> Estimate Std. Error z value
> >>
> Pr(>|z|)
> >> (Intercept)
>
> -0.547051 0.215981 -2.533
> 0.01131
> >> *
> >>
> Age
> -0.154493 0.063994 -2.414
> >> 0.01577 *
> >>
> gender
> -0.030942 0.284868 -0.109
> 0.91350
> >> Victimization
>
> 1.073956 0.338015 3.177
> 0.00149
> >> **
> >>
> FamilySupport
> -0.380360 0.247530 -1.537
> 0.12439
> >>
> Victimization\:FamilySupport
> -0.813329 0.399905 -2.034 0.04197 *
> >> ---
> >> Signif.
> codes: 0 '***' 0.001 '**' 0.01 '*'
> 0.05 '.' 0.1 ' ' 1
> >>
> >> Theta: count
> = 1.1303
> >> Number of iterations in
> BFGS optimization: 23
> >>
> Log-likelihood: -374.3 on 25 Df
> >>>
> summary(logistic)
> >>
> >>
> >>
> >>
> >> Call:
> >> glm(formula = SuicideBinary ~ Age +
> gender = Victimization * FamilySupport,
> >> family = "binomial")
> >>
> >> Deviance
> Residuals:
> >> Min
> 1Q Median 3Q
> Max
> >> -1.9948 -0.8470
> -0.6686 1.1160 2.0805
> >>
> >>
> Coefficients:
> >>
>
> Estimate Std. Error z value
> >>
> Pr(>|z|)
> >> (Intercept)
> -0.547051 0.215981
> -2.533 0.01131 *
> >> Age
>
> -0.154493 0.063994 -2.414 0.01577
> >> *
> >> gender
>
> -0.030942 0.284868 -0.109 0.91350
> >> Victimization
>
> 1.073956 0.338014 3.177
> 0.00149
> >> **
> >>
> FamilySupport
> -0.380360 0.247530 -1.537 0.12439
> >> Victimization:FamilySupport
> -0.813329 0.399904 -2.034 0.04197 *
> >> ---
> >> Signif.
> codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> >>
> >> (Dispersion
> parameter for binomial family taken to be 1)
> >>
> >>
> Null deviance: 452.54 on 359 degrees of
> freedom
> >> Residual deviance: 408.24
> on 348 degrees of freedom
> >> (52 observations deleted
> due to missingness)
> >> AIC: 432.24
> >>
> >> Number of
> Fisher Scoring iterations: 4
> >>
> >>> summary(Count1)
> >>
> >>
> >>
> >>
> >>
> >>
> >> Call:
> >>
> glm(formula = NegBinSuicide ~ Age + gender + Victimization *
> FamilySupport,
> >> family =
> negative.binomial(theta = 1.1303))
> >>
> >> Deviance
> Residuals:
> >> Min
> 1Q Median 3Q
> Max
> >> -1.6393 -0.4504
> -0.1679 0.2350 2.1676
> >>
> >>
> Coefficients:
> >>
>
> Estimate Std. Error t value
> >> Pr(>|t|)
> >>
> (Intercept)
> 0.60820 0.13779 4.414 2.49e-05
> >> ***
> >> Age
> 0.08836
> 0.04189 2.109 0.0373
> >> *
> >> gender
> 0.10983
> 0.17873 0.615 0.5402
> >> Victimization
> 0.73270 0.10776 6.799
> 6.82e-10
> >> ***
> >> FamilySupport
> 0.10213
> 0.15979 0.639 0.5241
> >>
> Victimization:FamilySupport -0.60146
> 0.24532 -2.452 0.0159 *
> >> ---
> >> Signif.
> codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> >>
> >> (Dispersion
> parameter for Negative Binomial(1.1303) family taken to
> be
> >> 0.4549082)
> >>
> >>
> Null deviance: 76.159 on 115 degrees of
> freedom
> >> Residual deviance: 35.101
> on 104 degrees of freedom
> >> (296 observations deleted
> due to missingness)
> >> AIC: 480.6
> >>
> >> Number of
> Fisher Scoring iterations: 15
> >>
> >>
> >>
> Alternatively, if there is a simpler way to plot hurdle
> regression output, or if anyone is award of another means of
> estimating NB models (I haven't had much luck with vglm
> from VGAM either), I would be happy to hear about that as
> well. I'm currently using the "visreg"
> >> package for plotting.
> >>
> >> Thanks!
> >>
> >>
> >>
> >>
> >>
> >>
> ______________________________________________
> >> R-help at r-project.org
> mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal,
> self-contained, reproducible code.
> >
> > --
> > Peter Dalgaard,
> Professor,
> > Center for Statistics,
> Copenhagen Business School
> > Solbjerg
> Plads 3, 2000 Frederiksberg, Denmark
> >
> Phone: (+45)38153501
> > Email: pd.mes at cbs.dk Priv:
> PDalgd at gmail.com
> >
> >
> ______________________________________________
> > R-help at r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal,
> self-contained, reproducible code.
> >
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK
More information about the R-help
mailing list