[R] vglm(), t values and p values

Federico Calboli f.calboli at imperial.ac.uk
Wed Nov 4 19:44:45 CET 2009


On 4 Nov 2009, at 18:40, Gavin Simpson wrote:
>>
>> Additionally, while extracting the t value is a piece of cake with
>> polr(), the p-value I get a nowhere close to a null distribution.
>
> Yes - I see that polr() also doesn't produce p-values in the output  
> from
> summary. You can use it to get "a" p-value for x if you use a LRT; fit
> the model using polr() with and without 'x' and then supply both  
> fitted
> models to the anova() method for polr objects.

I did think of that solution, but it's the last thing on my cards...  
I'm trying to analyse 1.5 million markers, and fitting two models at a  
time for every marker starts to be a bit on the heavy side. If all  
else fails that's the way to go.
>
>>
>> I will try lms() and hope for the best.
>
> Sorry, I meant the lrm() function in package rms

Ta,

F


>
> G
>
>>>
>>> I haven't really used either of these functions in earnest, but  
>>> one or
>>> both may provide the p-values you desire, out of the box.
>>
>> I hope so!
>>
>> Thanks,
>>
>> F
>>
>>
>>>
>>> HTH
>>>
>>> G
>>>
>>>>
>>>> My response variable is the severity of diseases, going from 0 to 5
>>>> (the
>>>> severity is actually an ordered factor).
>>>>
>>>> The independent variables are: 1 genetic marker, time of medical
>>>> observation,
>>>> age, sex. What I *need* is a p-value for the genetic marker.
>>>> Because I have ~1.5
>>>> million markers I'd rather not faffing around too much.
>>>>
>>>> My model is:
>>>>
>>>>> mod.vglm = vglm(disease.status ~ x + time + age + sex, family =
>>>> cumulative(par = T))
>>>>
>>>> where x is my genetic marker, coded as 0/1/2, time is days of
>>>> medical observation.
>>>>
>>>>> summary(mod.vglm) works:
>>>>
>>>> Call:
>>>> vglm(formula = disease.status ~ x + time + age + sex, family =
>>>> cumulative(par = T))
>>>>
>>>> Pearson Residuals:
>>>>                   Min       1Q   Median       3Q     Max
>>>> logit(P[Y<=1]) -0.6642 -0.28704 -0.18329 -0.11681  3.8919
>>>> logit(P[Y<=2]) -2.5580 -0.48080 -0.23315  0.47388  2.5983
>>>> logit(P[Y<=3]) -2.1565 -0.56961  0.22089  0.44349 10.7964
>>>> logit(P[Y<=4]) -3.3175  0.13064  0.20117  0.43176 12.5233
>>>>
>>>> Coefficients:
>>>>                    Value Std. Error  t value
>>>> (Intercept):1 -2.4460e+00 4.2791e-01  -5.7162
>>>> (Intercept):2 -7.1078e-01 4.1628e-01  -1.7074
>>>> (Intercept):3  3.7619e-01 4.1545e-01   0.9055
>>>> (Intercept):4  1.7467e+00 4.2092e-01   4.1496
>>>> x              4.1421e-01 1.9762e-01   2.0959
>>>> time          -3.6021e-04 3.0387e-05 -11.8540
>>>> age           -2.6115e-05 9.2504e-06  -2.8232
>>>> sexM           1.0188e-01 1.2491e-01   0.8156
>>>>
>>>> Number of linear predictors:  4
>>>>
>>>> Names of linear predictors:
>>>> logit(P[Y<=1]), logit(P[Y<=2]), logit(P[Y<=3]), logit(P[Y<=4])
>>>>
>>>> Dispersion Parameter for cumulative family:   1
>>>>
>>>> Residual Deviance: 2475.937 on 3460 degrees of freedom
>>>>
>>>> Log-likelihood: -1237.969 on 3460 degrees of freedom
>>>>
>>>> #######################
>>>>
>>>> So here are my questions:
>>>>
>>>> 1) I need to get the t value for x, so I can use "1 - pt(tvalue,1)"
>>>> to find some
>>>> sort of probability value for x. That's not trivial. Additionally,
>>>> I assume df
>>>> for x is 1, hence I plan to use  "1 - pt(tvalue,1)", though I might
>>>> well be
>>>> wrong. In any case getting the darned t value seems impossible
>>>>
>>>> 2) because of the difficulty of getting (1), it there a way of
>>>> getting vglm() to
>>>> spit out a p-value for x please?
>>>>
>>>> I do recon many people might scoff at my crass desire for a p-
>>>> value, but I'm
>>>> dealing with some dire phenotype in a whole genome analysis where
>>>> the *only*
>>>> thing that matters are p-values. I have to be quite unsophysticated
>>>> I'm afraid.
>>>>
>>>> Best,
>>>>
>>>> Federico
>>>>
>>>>
>>> -- 
>>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~ 
>>> %~%
>>> Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
>>> ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
>>> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
>>> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
>>> UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
>>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~ 
>>> %~%
>>>
>>
>> --
>> Federico C. F. Calboli
>> Department of Epidemiology and Public Health
>> Imperial College, St. Mary's Campus
>> Norfolk Place, London W2 1PG
>>
>> Tel +44 (0)20 75941602   Fax +44 (0)20 75943193
>>
>> f.calboli [.a.t] imperial.ac.uk
>> f.calboli [.a.t] gmail.com
>>
>>
>>
>>
>>
> -- 
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com




More information about the R-help mailing list