[R] Question on approximations of full logistic regression model
Frank Harrell
f.harrell at vanderbilt.edu
Mon May 16 15:01:16 CEST 2011
The choice is not clear, and would require some simulations to estimate
the average absolute error of each covariance matrix estimator.
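Such a simulation might look something like this (a rough sketch; the true
coefficients, sample size, and bootstrap replications are all illustrative):

library(rms)
set.seed(1)
nsim <- 200                     # simulation replicates
n    <- 104                     # sample size similar to yours
beta <- c(-2, 1, -0.5)          # assumed "true" coefficients
coefs  <- matrix(NA, nsim, length(beta))
v.info <- v.boot <- vector("list", nsim)
for (i in 1:nsim) {
  x1 <- rnorm(n); x2 <- rnorm(n)
  y  <- rbinom(n, 1, plogis(beta[1] + beta[2]*x1 + beta[3]*x2))
  f  <- lrm(y ~ x1 + x2, x=TRUE, y=TRUE)
  coefs[i, ]  <- coef(f)
  v.info[[i]] <- f$var                    # usual information-based estimate
  v.boot[[i]] <- bootcov(f, B=200)$var    # bootstrap estimate
}
V <- cov(coefs)                 # Monte Carlo "truth"
mean(sapply(v.info, function(v) mean(abs(v - V))))  # avg |error|, usual
mean(sapply(v.boot, function(v) mean(abs(v - V))))  # avg |error|, bootstrap

Whichever estimator has the smaller average absolute error is the one to
trust for a configuration like yours.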
Frank
細田弘吉 wrote:
>
> Thank you for your reply, Prof. Harrell.
>
> I agree with you. Dropping only one variable does not actually help a lot.
>
> I have one more question.
> During the analysis of this model I found that the confidence
> intervals (CIs) of some coefficients provided by bootstrapping (the
> bootcov function in the rms package) were narrower than the CIs from
> the usual variance-covariance matrix, while the CIs of other
> coefficients were wider. My data have no cluster structure. I am
> wondering which CIs are better. I guess the bootstrap ones, but is
> that right?
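>
> The two sets of CIs were obtained along these lines (a sketch;
> full.model is the penalized fit shown in my earlier post below, fitted
> with x=TRUE, y=TRUE as bootcov requires):
>
>> b <- bootcov(full.model, B=1000)    # bootstrap covariance matrix
>> se.u <- sqrt(diag(full.model$var))  # usual information-based SEs
>> se.b <- sqrt(diag(b$var))           # bootstrap SEs
>> cbind(coef(full.model) - 1.96*se.u, coef(full.model) + 1.96*se.u)
>> cbind(coef(b) - 1.96*se.b, coef(b) + 1.96*se.b)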
>
> I would appreciate your help.
> --
> KH
>
>
>
> (11/05/16 12:25), Frank Harrell wrote:
>> I think you are doing this correctly except for one thing. The
>> validation and other inferential calculations should be done on the
>> full model. Use the approximate model to get a simpler nomogram, but
>> not to get standard errors. Since you are dropping only one variable,
>> you might consider just running the nomogram on the entire model.
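>>
>> In code terms, roughly (a sketch using the object names from your post):
>>
>> validate(full.model, bw=FALSE, B=1000)       # inference: full model
>> plot(nomogram(full.ols.approx, fun=plogis))  # display: the approximation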
>> Frank
>>
>>
>> KH wrote:
>>>
>>> Hi,
>>> I am trying to construct a logistic regression model from my data
>>> (104 patients and 25 events). I built a full model consisting of five
>>> predictors, using penalization from the rms package (lrm, pentrace,
>>> etc.) because of the events-per-variable issue. Then I tried to
>>> approximate the full model by a step-down technique, predicting the
>>> linear predictor L from all of the component variables using ordinary
>>> least squares (ols in the rms package), as follows. I would like to
>>> know whether I am doing this right.
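>>>
>>> (For reference, the full model was fit along these lines; a sketch,
>>> with an illustrative penalty grid and outcome name:)
>>>
>>>> library(rms)
>>>> f0 <- lrm(outcome ~ stenosis+x1+x2+ClinicalScore+procedure,
>>>> x=TRUE, y=TRUE)  # outcome name is illustrative
>>>> p <- pentrace(f0, seq(0, 20, by=0.5))  # grid search for best penalty
>>>> full.model <- update(f0, penalty=p$penalty)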
>>>
>>>> library(rms)
>>>> plogit <- predict(full.model)  # linear predictor (logit scale)
>>>> full.ols <- ols(plogit ~ stenosis+x1+x2+ClinicalScore+procedure,
>>>> sigma=1)  # sigma=1 avoids sigma-hat=0 (plogit is an exact linear combination)
>>>> fastbw(full.ols, aics=1e10)  # huge aics => report the full deletion sequence
>>>
>>> Deleted        Chi-Sq  d.f.  P       Residual  d.f.  P       AIC     R2
>>> stenosis         1.41    1   0.2354      1.41    1   0.2354   -0.59  0.991
>>> x2              16.78    1   0.0000     18.19    2   0.0001   14.19  0.882
>>> procedure       26.12    1   0.0000     44.31    3   0.0000   38.31  0.711
>>> ClinicalScore   25.75    1   0.0000     70.06    4   0.0000   62.06  0.544
>>> x1              83.42    1   0.0000    153.49    5   0.0000  143.49  0.000
>>>
>>> Then I fitted an approximation to the full model using the most
>>> important variables, deleting variables only as long as the R^2 for
>>> predictions from the reduced model against the full model's predicted
>>> logit stayed above 0.95; that is, dropping only "stenosis".
>>>
>>>> full.ols.approx <- ols(plogit ~ x1+x2+ClinicalScore+procedure)
>>>> full.ols.approx$stats
>>> n Model L.R. d.f. R2 g Sigma
>>> 104.0000000 487.9006640 4.0000000 0.9908257 1.3341718 0.1192622
>>>
>>> This approximate model had an R^2 of 0.99 against the full model's
>>> predictions. Therefore, I updated the original full logistic model,
>>> dropping "stenosis" as a predictor.
>>>
>>>> full.approx.lrm <- update(full.model, ~ . - stenosis)
>>>
>>>> validate(full.model, bw=F, B=1000)
>>> index.orig training test optimism index.corrected n
>>> Dxy 0.6425 0.7017 0.6131 0.0887 0.5539 1000
>>> R2 0.3270 0.3716 0.3335 0.0382 0.2888 1000
>>> Intercept 0.0000 0.0000 0.0821 -0.0821 0.0821 1000
>>> Slope 1.0000 1.0000 1.0548 -0.0548 1.0548 1000
>>> Emax 0.0000 0.0000 0.0263 0.0263 0.0263 1000
>>>
>>>> validate(full.approx.lrm, bw=F, B=1000)
>>> index.orig training test optimism index.corrected n
>>> Dxy 0.6446 0.6891 0.6265 0.0626 0.5820 1000
>>> R2 0.3245 0.3592 0.3428 0.0164 0.3081 1000
>>> Intercept 0.0000 0.0000 0.1281 -0.1281 0.1281 1000
>>> Slope 1.0000 1.0000 1.1104 -0.1104 1.1104 1000
>>> Emax 0.0000 0.0000 0.0444 0.0444 0.0444 1000
>>>
>>> Validation revealed this approximation was not bad.
>>> Then I made a nomogram:
>>>
>>>> full.approx.lrm.nom <- nomogram(full.approx.lrm,
>>>> fun.at=c(0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95), fun=plogis)
>>>> plot(full.approx.lrm.nom)
>>>
>>> And another nomogram using the OLS approximation:
>>>
>>>> full.ols.approx.nom <- nomogram(full.ols.approx,
>>>> fun.at=c(0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95), fun=plogis)
>>>> plot(full.ols.approx.nom)
>>>
>>> These two nomograms are very similar, but slightly different.
>>>
>>> My questions are:
>>>
>>> 1. Am I doing this right?
>>>
>>> 2. Which nomogram is correct?
>>>
>>> I would appreciate your help.
>>>
>>> --
>>> KH
>>>
>>
>>
>> -----
>> Frank Harrell
>> Department of Biostatistics, Vanderbilt University
>
>
> E-mail address
> Office: khosoda at med.kobe-u.ac.jp
> Home : khosoda at venus.dti.ne.jp
>
-----
Frank Harrell
Department of Biostatistics, Vanderbilt University