[R] C-statistic comparison with partially paired datasets

Frank E Harrell Jr f.harrell at vanderbilt.edu
Thu Aug 13 14:00:49 CEST 2009

Hanneke Wijnhoven wrote:
> Frank,
> Thank you for your quick response!
> I want to compare the discriminative capacity of different 
> anthropometric measures in predicting mortality, focussing on the "thin" 
> side of these measures.
> Since these associations are not linear (U shaped for BMI and inversely 
> J-shaped for mid-upper arm circumference) and I do not want to include 
> the prediction by "obesity", I am using all values below the median of 
> each separate measure to calculate a C-statistic (below the median, the 
> association is approximately linear).
> As a result, some different and some overlapping cases are included.
> I understand your point though.
> Any suggestion is welcome.
> Hanneke

Subsetting the data will make the two task difficulties unequal, I fear.
This would make it difficult to compare predictive discrimination indexes.
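
To illustrate the concern about unequal task difficulty, here is a minimal
Python sketch (it is not the Hmisc implementation behind rcorr.cens). It
simulates two samples in which the predictor has exactly the same effect on
the hazard, but one sample covers a wider range of the predictor. The helper
names harrell_c and simulate, the exponential survival model, and the
censoring rate are all assumptions made for illustration.

```python
import math
import random


def harrell_c(risk, time, event):
    """Harrell's C: share of usable pairs in which the subject with the
    higher risk score has the shorter observed time."""
    concordant = 0.0
    usable = 0
    n = len(time)
    for i in range(n):
        for j in range(i + 1, n):
            if time[i] == time[j]:
                continue  # tied times are skipped in this sketch
            # Order the pair so that `a` has the shorter follow-up time.
            a, b = (i, j) if time[i] < time[j] else (j, i)
            if not event[a]:
                continue  # shorter time is censored: ordering unknown
            usable += 1
            if risk[a] > risk[b]:
                concordant += 1.0
            elif risk[a] == risk[b]:
                concordant += 0.5  # tied risk scores count as half
    return concordant / usable


def simulate(n, x_low, x_high, rng):
    """Exponential survival with log-hazard equal to x (assumed model),
    plus independent exponential censoring."""
    x, time, event = [], [], []
    for _ in range(n):
        xi = rng.uniform(x_low, x_high)
        t_event = rng.expovariate(math.exp(xi))
        t_cens = rng.expovariate(0.5)
        x.append(xi)
        time.append(min(t_event, t_cens))
        event.append(t_event <= t_cens)
    return x, time, event


rng = random.Random(2009)
wide = simulate(400, -2.0, 2.0, rng)     # more extreme observations
narrow = simulate(400, -0.5, 0.5, rng)   # restricted range, same effect
c_wide = harrell_c(*wide)
c_narrow = harrell_c(*narrow)
print(f"C, wide range of x:   {c_wide:.3f}")
print(f"C, narrow range of x: {c_narrow:.3f}")
```

The wide-range sample should give the larger C even though the predictor is
equally strong in both samples, which is why C-statistics computed on
different subsets are hard to compare.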

I think it would be better to fit splines to the continuous predictors, 
to allow for a unified analysis over the whole range.  Then everything 
is paired.
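
A minimal Python sketch of what a paired comparison over the whole range
could look like, under stated assumptions. Both measures are observed on the
same simulated subjects, and squared terms stand in for the linear predictors
that spline fits would produce (no spline model is actually fitted here). The
percentile bootstrap that resamples whole subjects is a swapped-in technique
chosen for simplicity; it is not what rcorrp.cens computes. All names, the
simulated data, and the U-shaped hazard are hypothetical.

```python
import random


def harrell_c(risk, time, event):
    """Harrell's C for right-censored data (tied times skipped)."""
    concordant = 0.0
    usable = 0
    n = len(time)
    for i in range(n):
        for j in range(i + 1, n):
            if time[i] == time[j]:
                continue
            a, b = (i, j) if time[i] < time[j] else (j, i)
            if not event[a]:
                continue  # shorter time is censored: pair not usable
            usable += 1
            if risk[a] > risk[b]:
                concordant += 1.0
            elif risk[a] == risk[b]:
                concordant += 0.5
    return concordant / usable


def paired_bootstrap_diff(risk1, risk2, time, event, n_boot, rng):
    """Resample subjects, keeping both scores of a subject together, and
    return the sorted bootstrap values of C(risk1) - C(risk2)."""
    n = len(time)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        t = [time[k] for k in idx]
        e = [event[k] for k in idx]
        c_a = harrell_c([risk1[k] for k in idx], t, e)
        c_b = harrell_c([risk2[k] for k in idx], t, e)
        diffs.append(c_a - c_b)
    return sorted(diffs)


rng = random.Random(13)
n = 150
score1, score2, time, event = [], [], [], []
for _ in range(n):
    z1 = rng.uniform(-2.0, 2.0)        # hypothetical measure 1
    z2 = z1 + rng.gauss(0.0, 1.5)      # hypothetical, noisier measure 2
    # Assumed truth: U-shaped log-hazard in z1, exponential survival.
    t_event = rng.expovariate(2.718281828 ** (z1 * z1 - 2.0))
    t_cens = rng.expovariate(0.3)
    score1.append(z1 * z1)             # stand-in for a spline fit of measure 1
    score2.append(z2 * z2)             # stand-in for a spline fit of measure 2
    time.append(min(t_event, t_cens))
    event.append(t_event <= t_cens)

c1 = harrell_c(score1, time, event)
c2 = harrell_c(score2, time, event)
n_boot = 100
diffs = paired_bootstrap_diff(score1, score2, time, event, n_boot, rng)
lo = diffs[int(0.025 * n_boot)]
hi = diffs[int(0.975 * n_boot)]
print(f"C1 = {c1:.3f}, C2 = {c2:.3f}, difference = {c1 - c2:.3f}")
print(f"approximate 95% bootstrap interval: ({lo:.3f}, {hi:.3f})")
```

Because every subject contributes to both C-statistics, the difference is
evaluated on one and the same discrimination task.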


> Frank E Harrell Jr wrote:
>> Hanneke Wijnhoven wrote:
>>> Does anyone know of an R-function or method to compare two 
>>> C-statistics (Harrell's C - rcorr.cens) obtained from 2 different 
>>> models in partially paired datasets (i.e. some similar and some 
>>> different cases), with one continuous independent variable in each 
>>> separate model (in a survival analysis context)?
>>> I have noticed that the rcorrp.cens function can be used for paired 
>>> data.
>>>   Thanks for any help,
>>> Hanneke Wijnhoven
>> Hanneke,
>> I'm having trouble seeing how the unpaired observations can contribute 
>> information in general.  If for example all of the observations were 
>> unpaired, one C-statistic might be larger because it came from a 
>> dataset with more extreme observations that were easier to discriminate.
>> Frank

Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
