[R] Appropriate tests for logistic regression with a continuous predictor variable and Bernoulli response variable

Fri Jul 9 14:58:21 CEST 2010

Have a look at
?anova
or rather
?anova.glm

hth

Am 09.07.2010 04:46, schrieb Kiyoshi Sasaki:
> I have a data with binary response variable, repcnd (pregnant or not) and one predictor continuous variable, svl (body size) as shown below. I did Hosmer-Lemeshow test as a goodness of fit (as suggested by a kind “R-helper” previously). To test whether the predictor (svl, or body size) has significant effect on predicting whether or not a female snake is pregnant, I used the differences between null deviance and residual deviance using a code as following:
>
>  
> 1-pchisq(mod.fit$null.deviance - mod.fit$deviance, mod.fit$df.null - mod.fit$df.residual)
>  
> Could anyone tell me whether I did the test properly? I did this test because I thought Wald test/z score listed in the output from "summary(mod.fit)" is not appropriate for a kind of data I have.  Does R have automated function to run appropriate tests? I have pasted my R output below.
>  
> Thank you in advance for your time and help.
>  
> Kiyoshi
>  
>  
>             repcnd             svl
> 1          1                      51.5
> 2          1                      52.5
> <edited>
> 294      0                      59.8
> 298      1                      60.0
> 300      1                      51.7
> 301      1                      57.4
> 302      1                      60.9
> 303      0                      56.8
> 304      0                      50.0
> -------------------
>   
>> mod.fit <- glm(formula = gb.no.M$repcnd ~ gb.no.M$svl, family = binomial(link = logit), data = gb.no.M, na.action = na.exclude, control = list(epsilon = 0.0001, maxit = 50, trace = F))
>> summary(mod.fit)
>>     
>  
> Call:
> glm(formula = gb.no.M$repcnd ~ gb.no.M$svl, family = binomial(link = logit), 
>     data = gb.no.M, na.action = na.exclude, control = list(epsilon = 1e-04, 
>         maxit = 50, trace = F))
>  
> Deviance Residuals: 
>    Min      1Q  Median      3Q     Max  
> -1.757  -1.109   0.734   1.113   1.632  
>  
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)    
> (Intercept) -7.08565    1.84106  -3.849 0.000119 ***
> gb.no.M$svl  0.13529    0.03474   3.894 9.85e-05 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
>  
> (Dispersion parameter for binomial family taken to be 1)
>  
>     Null deviance: 301.92  on 217  degrees of freedom
> Residual deviance: 285.04  on 216  degrees of freedom
>   (8 observations deleted due to missingness)
> AIC: 289.04
>  
> Number of Fisher Scoring iterations: 3
> -------------------------------------------------------------------------------
>   
>> Hosmer-Lemeshow test
>>
>> hosmerlem <- function (y, yhat, g = 10) 
>>     
> + {
> + cutyhat <- cut(yhat, breaks = quantile(yhat, probs = seq(0, 1, 1/g)), include.lowest = T)
> + obs <- xtabs(cbind(1 - y, y) ~ cutyhat)
> + expect <- xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
> +  chisq <- sum((obs - expect)^2/expect)
> + P <- 1 - pchisq(chisq, g - 2)
> + c("X^2" = chisq, Df = g - 2, "P(>Chi)" = P)
> + }
>   
>> mod.fit <- glm(formula = no.NA$repcnd ~  no.NA$svl, family = binomial(link = logit), data =  no.NA, na.action = na.exclude, control = list(epsilon = 0.0001, maxit = 50, trace = F))
>>     
>  
>   
>> hosmerlem(no.NA$repcnd, fitted(mod.fit))
>>     
>       X^2        Df   P(>Chi) 
> 6.8742531 8.0000000 0.5502587
> ---------------------------------------------------------------------------------------------------
>   
>> list(p.value = round(1-pchisq(mod.fit$null.deviance - mod.fit$deviance,
>>     
> + mod.fit$df.null- mod.fit$df.residual),6), 
> + df = mod.fit$df.null- mod.fit$df.residual,
> + change = mod.fit$null.deviance - mod.fit$deviance)
>  
> $p.value
> [1] 4e-05
>  
> $df
> [1] 1
>  
> $change
> [1] 16.87895
>
>
>       
> 	[[alternative HTML version deleted]]
>
>   
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>   

-- 
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790