[R] Survival::coxph (clogit), survConcordance vs. summary(fit) concordance

Thu Jan 21 16:01:51 CET 2016

I read the list in digest form, which puts me behind, and the last 2 days have been solid 
meetings with an external advisory group, so I missed the initial query.  Three responses.

1. The clogit routine sets the data up properly and then calls a stratified Cox model.  If 
you want the survConcordance routine to give the same answer, it also needs to know about 
the strata:
     survConcordance(Surv(rep(1, 76L), resp) ~ predict(fit) + strata(ID), data=dat)
I'm not surprised that you get a very different answer with/without strata.
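
To make the with/without strata difference concrete, here is a minimal base-R sketch. It is 
an illustration only, not survConcordance's actual implementation: the data are simulated, 
the column lp is a made-up stand-in for predict(fit), and no standard error is computed. In 
the unstratified output quoted below, concordant + discordant = 1315 + 129 = 1444 = 38 x 38, 
i.e. every case is compared with every control, and 1315/1444 gives the reported 0.9107. 
With strata(ID), only the 38 within-set pairs are comparable.

```r
# Hypothetical simulated data: 38 matched sets, one case and one control each.
set.seed(1)
n_pairs <- 38
dat <- data.frame(
  ID   = rep(seq_len(n_pairs), each = 2),
  resp = rep(c(1, 0), times = n_pairs)   # case first, then control, in each set
)

# Made-up risk score standing in for predict(fit): a per-set shift,
# a case effect, and noise. All three pieces are assumptions.
pair_shift <- rnorm(n_pairs, sd = 2)
dat$lp <- pair_shift[dat$ID] + 0.8 * dat$resp + rnorm(nrow(dat))

# Rows are ordered by ID, so element i of each vector belongs to set i.
case_lp    <- dat$lp[dat$resp == 1]
control_lp <- dat$lp[dat$resp == 0]

# Unstratified: every case is compared with every control (38 x 38 pairs).
diff_all  <- outer(case_lp, control_lp, "-")
c_unstrat <- (sum(diff_all > 0) + 0.5 * sum(diff_all == 0)) / length(diff_all)

# Stratified: each case is compared only with the control in its own set.
diff_within <- case_lp - control_lp
c_strat <- (sum(diff_within > 0) + 0.5 * sum(diff_within == 0)) /
  length(diff_within)

print(c(pairs_unstratified = length(diff_all),
        pairs_stratified   = length(diff_within)))
print(c(concordance_unstratified = c_unstrat,
        concordance_stratified   = c_strat))
```

The two estimates rest on different sets of pairs (1444 versus 38), so there is no reason 
for them to agree; which one is relevant depends on whether cross-set comparisons mean 
anything for a matched design.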

2. I've never thought of using a robust variance for the matched case/control model.  I'm 
having a hard time wrapping my head around what you would expect that to accomplish 
(statistically).  Subjects are already matched to someone from the same site, so where 
does a per-site effect creep in?  Assuming there is a good reason and I just don't see it 
(not an unwarranted assumption), I'm not aware of any work on what an appropriate variance 
would be for the concordance in that case.

3. I need to think about the large variance issue.

Terry Therneau

On 01/20/2016 08:09 PM, r-help-request at r-project.org wrote:
> Hi,
>
> I'm running conditional logistic regression with survival::clogit. I have
> "1-1 case-control" data, i.e., there is 1 case and 1 control in each stratum.
>
> Model:
> fit <- clogit(resp ~ x1 + x2 + strata(ID) + cluster(site), method = "efron",
> data = dat)
> Where resp is 1's and 0's, and x1 and x2 are both continuous.
>
> Predictors are both significant. A snippet of summary(fit):
> Concordance= 0.763  (se = 0.5 )
> Rsquare= 0.304   (max possible= 0.5 )
> Likelihood ratio test= 27.54  on 2 df,   p=1.047e-06
> Wald test            = 17.19  on 2 df,   p=0.0001853
> Score (logrank) test = 17.43  on 2 df,   p=0.0001644,   Robust = 6.66  p=0.03574
>
> The concordance estimate seems good but the SE is HUGE.
>
> I get a very different estimate from the survConcordance function, which I
> know says it computes concordance for a "single continuous covariate", but it
> runs on my model with 2 continuous covariates....
>
> survConcordance(Surv(rep(1, 76L), resp) ~ predict(fit), dat)
> n= 76
> Concordance= 0.9106648 se= 0.09365047
> concordant  discordant   tied.risk   tied.time    std(c-d)
>   1315.0000   129.0000     0.0000   703.0000   270.4626
>
> Are both of these concordance estimates valid but providing different
> information?
> Is one more appropriate for measuring "performance" (in the AUC sense) of
> conditional logistic models?
> Is it possible that the HUGE SE estimate represents a convergence problem
> (no warnings were thrown when fitting the model), or is this model just useless?
>
> Thanks!