[R] Practical work with logistic regression
Frank E Harrell Jr
f.harrell at Vanderbilt.Edu
Fri Apr 23 04:41:10 CEST 2010
Claus O'Rourke wrote:
> Dear all,
>
> I have a couple of short noob questions for whoever can take them. I'm
> from a very non-stats background so sorry for offending anybody with
> stupid questions! :-)
>
> I have been using logistic regression care of glm to analyse a binary
> dependent variable against a couple of independent variables. All has
> gone well so far. In my work I have to compare the accuracy of
> analysis to a C4.5 machine learning approach. With the machine
> learning, a straightforward measure of the quality of the classifier
> is simply the percentage of correctly classified instances. I can
> calculate this for the resultant model by comparing predictions to
> original values 'manually'. My question: is this not automatically -
> or easily - calculated in the produced model or the summary of that
> model?
The percentage classified correctly is an improper scoring rule;
optimizing it will lead to the selection of a bogus model. You can
easily find examples where adding a very important variable to a binary
logistic model decreases the percent "correct".
Frank
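
To make the point about scoring rules concrete, here is a toy sketch in
Java (the language mentioned later in this thread). Everything in it is an
assumption made for illustration: the class name ScoringRules, the 0.5
cutoff, and the two sets of predicted probabilities are made up and do not
come from any fitted model. It compares percent correct with two proper
scoring rules, the Brier score and the mean negative log-likelihood.

```java
// Toy comparison of percent correct with two proper scoring rules.
// All numbers are hypothetical; they are not output from a fitted model.
final class ScoringRules {

    // Fraction of cases where thresholding p at `cutoff` reproduces the outcome.
    static double accuracy(int[] y, double[] p, double cutoff) {
        int correct = 0;
        for (int i = 0; i < y.length; i++) {
            int predicted = p[i] >= cutoff ? 1 : 0;
            if (predicted == y[i]) {
                correct++;
            }
        }
        return (double) correct / y.length;
    }

    // Brier score: mean squared difference between the predicted
    // probability and the 0/1 outcome. Lower is better.
    static double brier(int[] y, double[] p) {
        double sum = 0.0;
        for (int i = 0; i < y.length; i++) {
            double diff = p[i] - y[i];
            sum += diff * diff;
        }
        return sum / y.length;
    }

    // Mean negative log-likelihood of the outcomes. Lower is better.
    static double logLoss(int[] y, double[] p) {
        double sum = 0.0;
        for (int i = 0; i < y.length; i++) {
            sum -= (y[i] == 1) ? Math.log(p[i]) : Math.log(1.0 - p[i]);
        }
        return sum / y.length;
    }

    public static void main(String[] args) {
        int[] y = {1, 1, 1, 1, 0, 0, 0, 0};
        // Hypothetical model A: barely informative, always on the right side of 0.5.
        double[] a = {0.51, 0.51, 0.51, 0.51, 0.49, 0.49, 0.49, 0.49};
        // Hypothetical model B: sharp probabilities, one case on the wrong side of 0.5.
        double[] b = {0.95, 0.95, 0.95, 0.45, 0.05, 0.05, 0.05, 0.05};

        System.out.println("A: accuracy=" + accuracy(y, a, 0.5)
                + " brier=" + brier(y, a) + " logLoss=" + logLoss(y, a));
        System.out.println("B: accuracy=" + accuracy(y, b, 0.5)
                + " brier=" + brier(y, b) + " logLoss=" + logLoss(y, b));
    }
}
```

In this toy setup model A classifies every case "correctly" while B misses
one, yet B has much the lower Brier score and log loss, because percent
correct ignores how far each probability is from the cutoff.
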
>
> I want to use my model in real time to produce results for new inputs.
> Basically this model is to be used as a classifier for a robot in real
> time. Can anyone suggest the best way that a produced model can be
> used directly in external code once the model has been developed in R?
> If my external code is in Java, then using jri is one option. A more
> efficient method would be to take the intercept and coefficients and
> actually code up the function in the appropriate programming language.
> Has anyone ever tried doing this?
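
A minimal sketch of the "code up the function" option, assuming a plain
binary logistic model whose predicted probability is
1 / (1 + exp(-(intercept + sum of coefficient * input))). The class name
LogisticModel and the numeric values are hypothetical placeholders; the
real intercept and coefficients would be copied from coef(fit) in R. The
sketch also assumes the inputs are already numeric and coded exactly as in
the R model matrix (same dummy coding for factors, same transformations).

```java
// Evaluates a fitted binary logistic model outside R.
// Coefficient values used in main() are placeholders, not real estimates.
final class LogisticModel {
    private final double intercept;
    private final double[] coefficients;

    LogisticModel(double intercept, double[] coefficients) {
        this.intercept = intercept;
        this.coefficients = coefficients.clone();
    }

    // Linear predictor: intercept + sum_i coefficients[i] * x[i].
    double linearPredictor(double[] x) {
        if (x.length != coefficients.length) {
            throw new IllegalArgumentException(
                    "expected " + coefficients.length + " inputs, got " + x.length);
        }
        double eta = intercept;
        for (int i = 0; i < x.length; i++) {
            eta += coefficients[i] * x[i];
        }
        return eta;
    }

    // Inverse logit of the linear predictor: the predicted probability that y = 1.
    double probability(double[] x) {
        return 1.0 / (1.0 + Math.exp(-linearPredictor(x)));
    }

    public static void main(String[] args) {
        // Hypothetical values; replace with the output of coef(fit) in R.
        LogisticModel model = new LogisticModel(-1.5, new double[] {0.8, -0.3});
        System.out.println(model.probability(new double[] {2.0, 1.0}));
    }
}
```

A sensible check before deploying is to compare a few probabilities from
this code against predict(fit, newdata, type = "response") in R for the
same inputs.
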
>
> Apologies again for the stupid questions, but the sooner I get some of
> these things straight, the better.
>
> Claus
>
--
Frank E Harrell Jr Professor and Chairman School of Medicine
Department of Biostatistics Vanderbilt University