[R] Custom caret metric based on prob-predictions/rankings

Max Kuhn mxkuhn at gmail.com
Fri Feb 10 14:50:31 CET 2012


I think you need to read the man pages and the four vignettes. A lot
of your questions have answers there.

If you don't specify the resampling indices, the ones generated for
you are saved in the train object:

> data(iris)
> TrainData <- iris[,1:4]
> TrainClasses <- iris[,5]
>
> knnFit1 <- train(TrainData, TrainClasses,
+                  method = "knn",
+                  preProcess = c("center", "scale"),
+                  tuneLength = 10,
+                  trControl = trainControl(method = "cv"))
Loading required package: class

Attaching package: ‘class’

The following object(s) are masked from ‘package:reshape’:

    condense

Warning message:
executing %dopar% sequentially: no parallel backend registered
> str(knnFit1$control$index)
List of 10
 $ Fold01: int [1:135] 1 2 3 4 5 6 7 9 10 11 ...
 $ Fold02: int [1:135] 1 2 3 4 5 6 8 9 10 12 ...
 $ Fold03: int [1:135] 1 3 4 5 6 7 8 9 10 11 ...
 $ Fold04: int [1:135] 1 2 3 5 6 7 8 9 10 11 ...
 $ Fold05: int [1:135] 1 2 3 4 6 7 8 9 11 12 ...
 $ Fold06: int [1:135] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold07: int [1:135] 1 2 3 4 5 7 8 9 10 11 ...
 $ Fold08: int [1:135] 2 3 4 5 6 7 8 9 10 11 ...
 $ Fold09: int [1:135] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold10: int [1:135] 1 2 4 5 6 7 8 10 11 12 ...
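
Those indices are the rows used for fitting in each fold, so the rows
that were tested are the complement. A minimal sketch, assuming you
rebuild the folds yourself with createFolds() (the names index and
holdout are only for illustration; with an existing fit you would start
from knnFit1$control$index instead):

```r
library(caret)

data(iris)
set.seed(1)

# Training-row indices per fold, in the same shape as knnFit1$control$index
index <- createFolds(iris$Species, k = 10, returnTrain = TRUE)

# The held-out rows of each fold are the rows not used for training
holdout <- lapply(index, function(train_rows) {
  setdiff(seq_len(nrow(iris)), train_rows)
})

str(holdout)
```

With plain k-fold CV every row should show up in exactly one hold-out
set.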

trainControl also has a savePredictions argument that keeps the
hold-out predictions in the train object.
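
A rough sketch of how that might look, assuming a reasonably recent
caret where the saved predictions end up in the pred element of the
fit and carry a rowIndex column pointing back at the original rows
(check names(fit$pred) on your version; fit is just an illustrative
name):

```r
library(caret)

data(iris)
set.seed(1)

fit <- train(iris[, 1:4], iris[, 5],
             method = "knn",
             preProcess = c("center", "scale"),
             tuneLength = 3,
             trControl = trainControl(method = "cv",
                                      classProbs = TRUE,
                                      savePredictions = TRUE))

# Held-out predictions, with the observed class, the resample label
# and the row of the original data each prediction belongs to
head(fit$pred)
```

With classProbs = TRUE the same data frame should also hold one
probability column per class level.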

I'm not sure which weights you are referring to.
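
For the original question quoted below (a tuning metric that uses the
probability estimates), here is a sketch under some assumptions: with
classProbs = TRUE the data frame passed to summaryFunction has one
probability column per class level next to obs and pred. The function
name logLossSummary and the choice of multiclass log loss are only
examples; any probability-based score could be returned the same way.

```r
library(caret)

# Hypothetical summary function: multiclass log loss computed from the
# class-probability columns that train() adds when classProbs = TRUE
logLossSummary <- function(data, lev = NULL, model = NULL) {
  probs <- as.matrix(data[, lev, drop = FALSE])
  # Pick, for each row, the probability assigned to the observed class
  idx <- cbind(seq_len(nrow(data)), match(as.character(data$obs), lev))
  # Clamp away from zero so log() stays finite
  p_true <- pmax(probs[idx], 1e-15)
  c(logLoss = -mean(log(p_true)))
}

data(iris)
set.seed(1)

fit <- train(iris[, 1:4], iris[, 5],
             method = "knn",
             preProcess = c("center", "scale"),
             tuneLength = 5,
             metric = "logLoss",
             maximize = FALSE,   # smaller log loss is better
             trControl = trainControl(method = "cv",
                                      classProbs = TRUE,
                                      summaryFunction = logLossSummary))

fit$bestTune
```

The name of the returned value has to match the metric argument, and
maximize = FALSE tells train to pick the smallest value.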

On Fri, Feb 10, 2012 at 4:38 AM, Yang Zhang <yanghatespam at gmail.com> wrote:
> Actually, is there any way to get at additional information beyond the
> classProbs?  In particular, is there any way to find out the
> associated weights, or otherwise the row indices into the original
> model matrix corresponding to the tested instances?
>
> On Thu, Feb 9, 2012 at 4:37 PM, Yang Zhang <yanghatespam at gmail.com> wrote:
>> Oops, found trainControl's classProbs right after I sent!
>>
>> On Thu, Feb 9, 2012 at 4:30 PM, Yang Zhang <yanghatespam at gmail.com> wrote:
>>> I'm dealing with classification problems, and I'm trying to specify a
>>> custom scoring metric (recall at p, ROC, etc.) that depends on not just
>>> the class output but the probability estimates, so that caret::train
>>> can choose the optimal tuning parameters based on this metric.
>>>
>>> However, when I supply a trainControl summaryFunction, the data given
>>> to it contains only class predictions, so the only metrics possible
>>> are things like accuracy, kappa, etc.
>>>
>>> Is there any way to do this that I'm overlooking?  If not, could I put
>>> this in as a feature request?  Thanks!
>>>
>>> --
>>> Yang Zhang
>>> http://yz.mit.edu/



-- 

Max


