[R] ROCR Issue: Averaging Across Multiple Classifier Runs in ROC Curve

Wed Nov 25 09:51:50 CET 2009

Dear R-philes,

I am having trouble averaging across multiple runs of a classifier
in an ROC curve. I am using the ROCR package and its plot() method.

First, I initialize a list with two elements, predictions and labels,
each of which is itself a list (one entry per classifier run):

vowel.ROC <- list(predictions=list(), labels=list())

For every run of the classifier, I append the scores and labels to  
their corresponding list elements as follows:

vowel.ROC$predictions <- append(vowel.ROC$predictions, list(scores))
vowel.ROC$labels <- append(vowel.ROC$labels, list(numGrps))
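To make the accumulation step concrete, here is a minimal sketch of the loop. The helper simulate_run() and its simulated scores are hypothetical stand-ins for the real classifier, which is not shown in this post; only the append() pattern is taken from above.

```r
# Hypothetical stand-in for one classifier run: returns scores in [0, 1]
# and numeric class labels in {1, 2}. This is NOT the original classifier.
simulate_run <- function(n) {
  numGrps <- sample(c(1, 2), n, replace = TRUE)
  scores  <- ifelse(numGrps == 2, rbeta(n, 4, 2), rbeta(n, 2, 4))
  list(scores = scores, numGrps = numGrps)
}

set.seed(1)
vowel.ROC <- list(predictions = list(), labels = list())

for (n in c(148, 152, 293, 247)) {
  run <- simulate_run(n)
  # Wrapping in list() keeps each run as a single element instead of
  # splicing its individual values into the outer list.
  vowel.ROC$predictions <- append(vowel.ROC$predictions, list(run$scores))
  vowel.ROC$labels      <- append(vowel.ROC$labels, list(run$numGrps))
}

str(vowel.ROC)
```

Each run contributes exactly one vector to predictions and one to labels, so the two inner lists stay aligned by position.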

The str() output of vowel.ROC looks like the following (I am not sure
why NULL is printed at the end):

List of 2
  $ predictions:List of 4
   ..$ : num [1:148] 0.234 0.293 0.275 0.391 0.191 ...
   ..$ : num [1:152] 0.99 0.974 0.934 0.767 0.934 ...
   ..$ : num [1:293] 0.05009 0.00739 0.03211 0.85894 0.18265 ...
   ..$ : num [1:247] 0.0184 0.0168 0.0942 0.0149 0.089 ...
  $ labels     :List of 4
   ..$ : num [1:148] 1 1 1 1 1 2 2 1 1 1 ...
   ..$ : num [1:152] 2 2 2 2 2 1 1 1 1 2 ...
   ..$ : num [1:293] 1 1 1 1 2 1 1 2 2 1 ...
   ..$ : num [1:247] 1 1 1 1 1 2 1 1 1 2 ...
NULL
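One possible source of the trailing NULL, assuming the call was wrapped in print() or made somewhere that prints return values: str() writes its summary as a side effect and returns NULL invisibly. The small list below is made up for illustration.

```r
# Made-up miniature of the structure above, for illustration only.
x <- list(predictions = list(c(0.2, 0.9)), labels = list(c(1, 2)))

ret <- str(x)      # prints the structure; the return value is captured
is.null(ret)       # str() returns NULL (invisibly)

print(str(x))      # prints the structure, then the returned NULL
```

Calling str(x) on its own at the top level would not show the extra NULL.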

I then create prediction and performance objects and plot the
performance object as shown below:

pred <- prediction(vowel.ROC$predictions, vowel.ROC$labels)
perf <- performance(pred, "tpr", "fpr")
plot(perf, avg="horizontal", spread.estimate="boxplot")
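For reference, this is what I understand by horizontal averaging: fix a grid of TPR values and average the FPR across runs at each of them. The sketch below does this in base R on simulated data. It is not ROCR's internal implementation, and simulate_run(), roc_points(), the choice of class 2 as the positive class, and the grid spacing are all assumptions made for illustration.

```r
set.seed(42)

# Hypothetical stand-in for one classifier run (not the original data).
simulate_run <- function(n) {
  labels <- rep(c(1, 2), length.out = n)
  scores <- ifelse(labels == 2, rbeta(n, 4, 2), rbeta(n, 2, 4))
  list(scores = scores, labels = labels)
}

# Empirical ROC points for one run; class 2 is ASSUMED to be positive.
roc_points <- function(scores, labels, positive = 2) {
  pos <- labels[order(scores, decreasing = TRUE)] == positive
  list(fpr = c(0, cumsum(!pos) / sum(!pos)),
       tpr = c(0, cumsum(pos) / sum(pos)))
}

runs     <- lapply(c(148, 152, 293, 247), simulate_run)
tpr_grid <- seq(0, 1, by = 0.1)

# One column per run: FPR interpolated at each fixed TPR value.
# ties = mean collapses the repeated TPR values of the step curve.
fpr_at_grid <- sapply(runs, function(run) {
  r <- roc_points(run$scores, run$labels)
  approx(r$tpr, r$fpr, xout = tpr_grid, ties = mean)$y
})

avg_fpr <- rowMeans(fpr_at_grid)   # horizontal average across the runs
print(data.frame(tpr = tpr_grid, avg_fpr = avg_fpr))
```

Plotting avg_fpr against tpr_grid, for example with plot(avg_fpr, tpr_grid, type = "l"), would give the single averaged curve I am after.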

Instead of a single averaged ROC curve, I get four separate curves on
the same plot. That is useful, but what I want is the average curve.

Any direction would be much appreciated.

Regards,

Na'im