[R] Clogit or LRM?
Noah Silverman
noah at smartmediacorp.com
Wed Aug 26 01:25:29 CEST 2009
Hello
I believe that I'm getting very close in my modeling application.
I've come across a challenge that I am unable to solve and would really
appreciate the group's opinion.
I've been using the val.prob function from the Design library (Thanks
Frank!!) to both evaluate and visualize my model.
From the scores and graph, it appears that my model is very accurate in
predicting probabilities. Please see attachment "graph1.pdf".
Since I'm scoring horse races, I assume that I need to "normalize" the
predicted probabilities by race. (Described in Bentor.)
I am calculating a conditional logit manually since there is a bug in
the Survival library for this function.
Applying the val.prob function to my conditional logit scores shows an
interesting result. The line is almost perfectly parallel to the
"ideal" mark on the graph, but is offset by a significant amount. My
first thought is that this indicates an error in my calculation
somewhere. Please see attachment "graph2.pdf"
Below is the two-step process that I used for the conditional logit.
--------------------------------------------------
1) First a standard logistic regression is calculated on two variables:
model <- lrm(label ~ val1 + val2, data = traindata )
This gives me the following results:
           Coef    S.E.     Wald Z  P
Intercept  1.8065  0.05137  35.16   0
val1       0.8105  0.02567  31.57   0
val2       0.5218  0.04308  12.11   0
2) I then calculate a conditional logit:
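# exp() of the linear predictor (without the intercept) on the test data
testdata$log_int <- exp(model$coefficients[2] * testdata$val1 +
                        model$coefficients[3] * testdata$val2)
# Normalize within each race so the probabilities sum to 1 per race
for (race in unique(testdata$race)) {
  in_race <- testdata$race == race
  testdata$c_prob[in_race] <- testdata$log_int[in_race] /
    sum(testdata$log_int[in_race])
}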
---------------------------------------------------
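For reference, the per-race normalization in step 2 can also be written without an explicit loop. This is only a minimal sketch on made-up data: the data frame `toy` and its values are hypothetical stand-ins for testdata, `beta` simply copies the two slopes from the lrm output above, and base R's ave() is swapped in for the for loop.

```r
# Hypothetical toy data: two races with three runners each (values made up)
toy <- data.frame(
  race = c(1, 1, 1, 2, 2, 2),
  val1 = c(0.2, -0.5, 1.0, 0.0, 0.3, -1.2),
  val2 = c(1.1, 0.4, -0.3, 0.7, 0.7, 0.1)
)

# Slopes copied from the lrm output; the intercept is left out because it
# cancels when dividing by the within-race sum
beta <- c(val1 = 0.8105, val2 = 0.5218)

# exp() of the linear predictor for each runner
toy$log_int <- exp(beta[["val1"]] * toy$val1 + beta[["val2"]] * toy$val2)

# Divide each runner's score by the total score of its race
toy$c_prob <- toy$log_int / ave(toy$log_int, toy$race, FUN = sum)

# Sanity check: probabilities sum to 1 within every race
race_totals <- tapply(toy$c_prob, toy$race, sum)
print(race_totals)
```

Because each race sums to 1 by construction, the average normalized probability in a race is 1 divided by the number of runners, whatever the unnormalized lrm predictions were.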
Do you have any idea why this might be happening? Did I miss something
in my calculation?
Additionally, please notice the "Logistic Calibration" line on graph1.
It appears almost perfect. My thought is that whatever transformation
the val.prob is doing to my predictions is helping. How would I
store/access those values?
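One possible way to store such values, sketched under an assumption: if the "Logistic Calibration" curve is a logistic regression of the outcome on the logit of the predicted probability, the same fit can be reproduced with base R's glm() and applied to any set of predictions. The vectors `p` and `y` below are simulated placeholders rather than the real predictions and labels, and `recalibrate` is a hypothetical helper name.

```r
set.seed(1)

# Simulated stand-ins: y = observed 0/1 outcomes, p = predicted
# probabilities that are systematically too high
n <- 2000
true_logit <- rnorm(n)
y <- rbinom(n, 1, plogis(true_logit))
p <- plogis(true_logit + 1)   # predictions shifted upward on the logit scale

# Logistic regression of the outcome on the logit of the prediction
lp <- qlogis(p)
cal_fit <- glm(y ~ lp, family = binomial)
print(coef(cal_fit))   # calibration intercept and slope

# Hypothetical helper: apply the fitted intercept and slope to predictions
recalibrate <- function(p_new) {
  plogis(coef(cal_fit)[[1]] + coef(cal_fit)[[2]] * qlogis(p_new))
}

# Store the recalibrated probabilities and compare the averages
p_cal <- recalibrate(p)
print(c(raw = mean(p), calibrated = mean(p_cal), observed = mean(y)))
```

The intercept and slope from this fit should correspond to the Intercept and Slope entries in the statistics that val.prob returns, though that is worth checking against the function's documentation.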
Once I can finalize the prediction of probabilities, then I can focus on
the application to a betting model. Having a high level of confidence
in my model's predictions is obviously the first step.
I really appreciate it.
Thanks!
-Noah
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graph2.pdf
Type: application/pdf
Size: 290782 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090825/157cd8e1/attachment-0004.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graph1.pdf
Type: application/pdf
Size: 289181 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090825/157cd8e1/attachment-0005.pdf>