[R] Survival Analysis with an Historical Control

Mon Jul 21 17:33:59 CEST 2014

Hi Paul,
Sorry for the delayed reply.  I was away last week. I'm not clear what you want confirmed about your approach.

(a) "20.57"- computing the rejection region of the analysis.  The formulas implemented at the addresses you gave in your original post are from a reputable source - Lawless (1982) - at least that is claimed in the help file (http://www.swogstat.org/stat/Public/Help/survival1.html).  It seems that another person derived 20.57 from some combination of input.  I don't see how to back calculate what the input was from the information provided.  Perhaps elsewhere in your study protocol there is discussion of accrual time, et al. that can help you.

(b) "16" - the value of the parameter in the null hypothesis is very important as you noted in your response to Terry.  But this is not really a statistical question. It may be derived from historical data, expert opinion, regulatory mandate, or some combination of these and other factors.  Presumably this study was undertaken because 16 was an important number to somebody.

(c) performing the one-sample survival analysis itself.  This is what I did with the data you provided.

# Non-parametric
(km <- survfit(Surv(WKS, 1-CENS) ~ 1, data=hsv, type="kaplan-meier", conf.type="log", conf.int=0.9))
summary(km)

# Compare to median survival = 16
# (Used 90% CI above to get 0.05 one sided test here)
quantile(km, prob=0.5)$lower > 16

# Parametric
(paraExp <- survreg(Surv(WKS, 1-CENS) ~ 1, data=hsv, dist="exponential"))
summary(paraExp)

# Compare to median survival = 16
# That is, compare to beta0 = log(16/log(2)) = 3.1391...
# one sided test
pnorm(c((coef(paraExp) - log(16/log(2))) / sqrt(vcov(paraExp))), lower.tail=FALSE)

Chris

-----Original Message-----
From: Paul Miller [mailto:pjmiller_57 at yahoo.com] 
Sent: Thursday, July 10, 2014 8:59 AM
To: Andrews, Chris
Cc: r-help at r-project.org
Subject: RE: [R] Survival Analysis with an Historical Control

Hi Chris,

Thanks for pointing out the use of "View page source". Very helpful to know.

Do you happen to know anything about how to perform the analysis itself? I haven't been able to find anything confirming that the approach described in my original email (below) is correct. 

Thanks,

Paul

--------------------------------------------
On Wed, 7/9/14, Andrews, Chris <chrisaa at med.umich.edu> wrote:

 Subject: RE: [R] Survival Analysis with an Historical Control
 To: "Paul Miller" <pjmiller_57 at yahoo.com>, "r-help at r-project.org" <r-help at r-project.org>
 Received: Wednesday, July 9, 2014, 11:26 AM

 The code is actually
 available at the websites you provide.  Try "View page
 source" in your browser.  The most cryptic code
 isn't needed because the math functions (e.g, incomplete
 gamma function) are available in R.

 -----Original Message-----
 From: Paul Miller [mailto:pjmiller_57 at yahoo.com]

 Sent: Tuesday, July 08, 2014 12:00 PM
 To: r-help at r-project.org
 Subject: [R] Survival Analysis with an
 Historical Control

 Hello
 All,

 I'm trying to
 figure out how to perform a survival analysis with an
 historical control. I've spent some time looking online
 and in my boooks but haven't found much showing how to
 do this. Was wondering if there is a R package that can do
 it, or if there are resources somewhere that show the actual
 steps one takes, or if some knowledgeable person might be
 willing to share some code. 

 Here is a statement that describes the sort of
 analyis I'm being asked to do.

 A one-sample parametric test assuming an
 exponential form of survival was used to test the hypothesis
 that the treatment produces a median PFS no greater than the
 historical control PFS of 16 weeks.  A sample median PFS
 greater than 20.57 weeks would fall beyond the critical
 value associated with the null hypothesis, and would be
 considered statistically significant at alpha = .05, 1
 tailed.  

 My understanding
 is that the cutoff of 20.57 weeks was obtained using an
 online calculator that can be found at:

 http://www.swogstat.org/stat/public/one_survival.htm

 Thus far, I've been unable
 to determine what values were plugged into the calculator to
 get the cutoff.

 There's
 another calculator for a nonparamertric test that can be
 found at:

 http://www.swogstat.org/stat/public/one_nonparametric_survival.htm

 It would be nice to try doing
 this using both a parameteric and a non-parametric model.

 So my first question would be
 whether the approach outlined above is valid or if the
 analysis should be done some other way. If the basic idea is
 correct, is it relatively easy (for a Terry Therneau type
 genius) to implement the whole thing using R? The calculator
 is a great tool, but, if reasonable, it would be nice to be
 able to look at some code to see how the numbers actually
 get produced.

 Below are
 some sample survival data and code in case this proves
 helpful.

 Thanks,

 Paul

 ###################################
 #### Example Data: GD2 Vaccine ####
 ###################################

 connection <-
 textConnection("
 GD2  1   8
 12  GD2  3 -12 10  GD2  6 -52  7
 GD2 
 7  28 10  GD2  8  44  6  GD2 10  14  8
 GD2 12   3  8  GD2 14 -52  9 
 GD2 15  35 11
 GD2 18   6 13 
 GD2 20  12  7  GD2 23  -7 13
 GD2 24
 -52  9  GD2 26 -52 12  GD2 28  36 13
 GD2
 31 -52  8  GD2 33   9 10  GD2 34 -11 16
 GD2 36 -52  6  GD2 39  15 14  GD2 40  13
 13
 GD2 42  21 13  GD2 44 -24 16  GD2 46
 -52 13
 GD2 48  28  9  GD2  2  15  9 
 GD2  4 -44 10
 GD2  5  -2 12  GD2 
 9   8  7  GD2 11  12  7
 GD2
 13 -52  7  GD2 16  21  7  GD2 17  19 11
 GD2 19   6 16  GD2 21  10 16  GD2
 22 -15  6
 GD2 25   4 15  GD2
 27  -9  9  GD2 29  27 10
 GD2
 30   1 17  GD2 32  12  8  GD2 35  20  8
 GD2 37 -32  8  GD2 38  15  8  GD2
 41   5 14
 GD2 43  35 13  GD2
 45  28  9  GD2 47   6 15
 ")

 hsv
 <- data.frame(scan(connection, list(VAC="",
 PAT=0, WKS=0, X=0)))
 hsv <-
 transform(hsv, CENS=ifelse(WKS < 1, 1, 0),
 WKS=abs(WKS))
 head(hsv)

 require("survival")

 survObj <- Surv(hsv$WKS,
 hsv$CENS==0) ~ 1

 km <-
 survfit(survObj, type=c("kaplan-meier"))
 print(km)

 paraExp <- survreg(survObj,
 dist="exponential")
 print(paraExp)

 **********************************************************
 Electronic Mail is not secure, may not be read
 every day, and should not be used for urgent or sensitive
 issues 

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues