[R] Question regarding the discrepancy between count model parameter estimates between "pscl" and "MASS"

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Aug 29 20:36:52 CEST 2014


On 29/08/2014 17:46, Nick Livingston wrote:
> Thank you for your responses.
>
> Since my previous attempt to manually truncate my DV didn't work, I'm very interested in trying again using the zerotrun() function. However, I attempted to install "countreg" but received the following notification:
>
>          Warning in install.packages :
>                unable to access index for repository http://R-Forge.R-project.org/bin/macosx/contrib/3.0
>
>                package ‘countreg’ is available as a source package but not as a binary
>
>           Warning in install.packages :
>                package ‘countreg’ is not available (for R version 3.0.3)
>
> I received the same message when attempting to install it in version 3.1.0, and the latest version, 3.1.1. Am I missing something?

As the package does not contain compiled code, you can simply install 
from the sources.  (I just did so in the GUI: just untick 'binary'.)

> Thank you again. I appreciate your input.
>
> -Nick
> --------------------------------------------
> On Fri, 8/29/14, Achim Zeileis <Achim.Zeileis at uibk.ac.at> wrote:
>
>   Subject: Re: [R] Question regarding the discrepancy between count model parameter estimates between "pscl" and "MASS"
>   To: "peter dalgaard" <pdalgd at gmail.com>
>   Cc: "Nick Livingston" <nlivingston at ymail.com>, r-help at r-project.org
>   Date: Friday, August 29, 2014, 5:26 AM
>
>   On Fri, 29 Aug 2014,
>   peter dalgaard wrote:
>
>   >
>   I'm no expert on hurdle models, but it seems that you
>   are unaware that
>   > the negative binomial
>   and the truncated negative binomial are quite
>   > different things.
>
>   Yes. You can replicate the truncated count part
>   of the hurdle model with
>   the zerotrunc()
>   function from the "countreg" package. The package
>   is not
>   yet on CRAN but can be easily
>   installed from R-Forge.
>
>   > -pd
>   >
>   >
>   > On 29 Aug 2014, at
>   05:57 , Nick Livingston <nlivingston at ymail.com>
>   wrote:
>   >
>   >> I have
>   sought consultation online and in person, to no avail. I
>   hope someone
>   >> on here might have
>   some insight. Any feedback would be most welcome.
>   >>
>   >> I am
>   attempting to plot predicted values from a two-component
>   hurdle model
>   >> (logistic [suicide
>   attempt yes/no] and negative binomial count [number of
>   >> attempts thereafter]). To do so, I
>   estimated each component separately using
>   >> glm (MASS). While I am able to
>   reproduce hurdle results for the logit
>   >> portion in glm, estimates for the
>   negative binomial count component are
>   >> different.
>   >>
>   >> Call:
>   >>
>   hurdle(formula = Suicide. ~ Age + gender + Victimization *
>   FamilySupport |
>   >> Age + gender +
>   Victimization * FamilySupport, dist = "negbin",
>   link =
>   >> "logit")
>   >>
>   >> Pearson
>   residuals:
>   >>     Min
>     1Q  Median      3Q     Max
>   >> -0.9816 -0.5187 -0.4094  0.2974
>   5.8820
>   >>
>   >>
>   Count model coefficients (truncated negbin with log
>   link):
>   >>
>
>      Estimate Std. Error z value
>   >> Pr(>|z|)
>   >>
>   (Intercept)                          -0.29150
>     0.33127  -0.880   0.3789
>   >> Age
>                 0.17068
>   0.07556   2.259   0.0239
>   >> *
>   >> gender
>
>      0.28273
>   0.31614   0.894   0.3712
>   >> Victimization
>            1.08405
>   0.18157   5.971 2.36e-09
>   >>
>   ***
>   >> FamilySupport
>             0.33629
>   0.29302   1.148   0.2511
>   >> Victimization:FamilySupport -0.96831
>     0.46841  -2.067   0.0387 *
>   >> Log(theta)
>             0.12245
>   0.54102   0.226   0.8209
>   >> Zero hurdle model coefficients
>   (binomial with logit link):
>   >>
>
>       Estimate Std. Error z value
>   >>
>   Pr(>|z|)
>   >> (Intercept)
>
>      -0.547051   0.215981  -2.533
>   0.01131
>   >> *
>   >>
>   Age
>      -0.154493   0.063994  -2.414
>   >> 0.01577 *
>   >>
>   gender
>      -0.030942   0.284868  -0.109
>   0.91350
>   >> Victimization
>
>   1.073956   0.338015   3.177
>   0.00149
>   >> **
>   >>
>   FamilySupport
>      -0.380360   0.247530  -1.537
>   0.12439
>   >>
>   Victimization\:FamilySupport
>   -0.813329   0.399905  -2.034  0.04197 *
>   >> ---
>   >> Signif.
>   codes:  0 '***' 0.001 '**' 0.01 '*'
>   0.05 '.' 0.1 ' ' 1
>   >>
>   >> Theta: count
>   = 1.1303
>   >> Number of iterations in
>   BFGS optimization: 23
>   >>
>   Log-likelihood: -374.3 on 25 Df
>   >>>
>   summary(logistic)
>   >>
>   >>
>   >>
>   >>
>   >> Call:
>   >> glm(formula = SuicideBinary ~ Age +
>   gender = Victimization * FamilySupport,
>   >> family = "binomial")
>   >>
>   >> Deviance
>   Residuals:
>   >>     Min
>      1Q   Median       3Q
>       Max
>   >> -1.9948  -0.8470
>   -0.6686   1.1160   2.0805
>   >>
>   >>
>   Coefficients:
>   >>
>
>   Estimate Std. Error z value
>   >>
>   Pr(>|z|)
>   >> (Intercept)
>                     -0.547051   0.215981
>   -2.533  0.01131 *
>   >> Age
>
>   -0.154493   0.063994  -2.414  0.01577
>   >> *
>   >> gender
>
>   -0.030942   0.284868  -0.109  0.91350
>   >> Victimization
>
>      1.073956   0.338014   3.177
>   0.00149
>   >> **
>   >>
>   FamilySupport
>   -0.380360   0.247530  -1.537  0.12439
>   >> Victimization:FamilySupport
>   -0.813329   0.399904  -2.034  0.04197 *
>   >> ---
>   >> Signif.
>   codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>   >>
>   >> (Dispersion
>   parameter for binomial family taken to be 1)
>   >>
>   >>
>      Null deviance: 452.54  on 359  degrees of
>   freedom
>   >> Residual deviance: 408.24
>   on 348  degrees of freedom
>   >>   (52 observations deleted
>   due to missingness)
>   >> AIC: 432.24
>   >>
>   >> Number of
>   Fisher Scoring iterations: 4
>   >>
>   >>> summary(Count1)
>   >>
>   >>
>   >>
>   >>
>   >>
>   >>
>   >> Call:
>   >>
>   glm(formula = NegBinSuicide ~ Age + gender + Victimization *
>   FamilySupport,
>   >> family =
>   negative.binomial(theta = 1.1303))
>   >>
>   >> Deviance
>   Residuals:
>   >>     Min
>      1Q   Median       3Q
>       Max
>   >> -1.6393  -0.4504
>   -0.1679   0.2350   2.1676
>   >>
>   >>
>   Coefficients:
>   >>
>
>      Estimate Std. Error t value
>   >> Pr(>|t|)
>   >>
>   (Intercept)
>   0.60820    0.13779   4.414 2.49e-05
>   >> ***
>   >> Age
>                                     0.08836
>     0.04189   2.109   0.0373
>   >> *
>   >> gender
>                                 0.10983
>   0.17873   0.615   0.5402
>   >> Victimization
>             0.73270    0.10776   6.799
>   6.82e-10
>   >> ***
>   >> FamilySupport
>           0.10213
>   0.15979   0.639   0.5241
>   >>
>   Victimization:FamilySupport   -0.60146
>   0.24532  -2.452   0.0159 *
>   >> ---
>   >> Signif.
>   codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>   >>
>   >> (Dispersion
>   parameter for Negative Binomial(1.1303) family taken to
>   be
>   >> 0.4549082)
>   >>
>   >>
>      Null deviance: 76.159  on 115  degrees of
>   freedom
>   >> Residual deviance: 35.101
>   on 104  degrees of freedom
>   >>   (296 observations deleted
>   due to missingness)
>   >> AIC: 480.6
>   >>
>   >> Number of
>   Fisher Scoring iterations: 15
>   >>
>   >>
>   >>
>   Alternatively, if there is a simpler way to plot hurdle
>   regression output, or if anyone is award of another means of
>   estimating NB models (I haven't had much luck with vglm
>   from VGAM either), I would be happy to hear about that as
>   well. I'm currently using the "visreg"
>   >> package for plotting.
>   >>
>   >> Thanks!
>   >>
>   >>
>   >>
>   >>
>   >>
>   >>
>   ______________________________________________
>   >> R-help at r-project.org
>   mailing list
>   >> https://stat.ethz.ch/mailman/listinfo/r-help
>   >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>   >> and provide commented, minimal,
>   self-contained, reproducible code.
>   >
>   > --
>   > Peter Dalgaard,
>   Professor,
>   > Center for Statistics,
>   Copenhagen Business School
>   > Solbjerg
>   Plads 3, 2000 Frederiksberg, Denmark
>   >
>   Phone: (+45)38153501
>   > Email: pd.mes at cbs.dk  Priv:
>   PDalgd at gmail.com
>   >
>   >
>   ______________________________________________
>   > R-help at r-project.org
>   mailing list
>   > https://stat.ethz.ch/mailman/listinfo/r-help
>   > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>   > and provide commented, minimal,
>   self-contained, reproducible code.
>   >
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK



More information about the R-help mailing list