[R] CoxME: Family relatedness

Mon Sep 15 23:20:18 CEST 2014

I would have caught this tomorrow (I read the digest).
Some thoughts:

1. Skip the entire step of subsetting the death.kmat object.  The coxme function knows how 
to do this on its own, and is more likely to get it correct.  My version of your code would be
   deathdat.kmat <- 2* with(deathdat, makekinship(famid, id, faid, moid))
   model3 <- coxme(Surv(Survival, Event) ~ Sex + strata(cohort) + SNP1 + SNP2 + SNP3 + (1|id),
                 data=death.dat, varlist=deathdat.kmat)

This all assumes that the "id" variable is unique.  If family 3 and family 4 both have an 
id of "1", then the coxme call can't match up rows in the data to rows/cols in the kinship 
matrix uniquely.  But that is simple to fix.
The kinship matrix K has .5 on the diagonal, by definition, but when used as a correlation 
most folks prefer to use 2K.  (This causes mixups since some software adds the "2" for 
you, but coxme does not.)

2. The model above is the correct covariance structure for a set of families.  There is a 
single intercept per subject, with a complex correlation matrix.  The simpler "per family" 
frailty model would be

     model4 <- coxme(Surv(Survival, Event) ~ Sex + strata(cohort) + SNP1 + SNP2 + SNP3 + 
(1|famid), death.dat)

This model lets each family have a separate risk, with everyone in the same family sharing 
the exact same risk.  It is less general than model3 above which lets a family have higher 
risk plus has variation between family members.

A model with both per-subject and per family terms is identical to one with a covariance 
matrix of s1 K + s2 B, where K is the kinship matrix, B is a block diagonal matrix which 
has a solid block of "1" for each family, and s1 s2 are the fitted variance coefficients.

   I don't find this intuitive, but have seen the argument that "B" is a shared 
environmental effect.  (Perhaps because I have large family trees where they do not all 
live together).  If you want such a model:
    model5 <- coxme(...... + (1|id) + (1|famid), death.dat, varlist=deathdat.kmat)

(When the varlist is too short the program uses the default for remaining terms).

3. To go further you will need to tell us what you are trying to model, as math formulas 
not as R code.

4. The error messages you got would more properly read "I'm confused" on the part of the 
program.  They cases of something I would never do, so never got that message.  Therefore 
useful for me to see.  Continuous variables go to the left of the | and categoricals to 
the right of the |.  Having a family id to the left makes no sense at all.

Terry Therneau

On 09/15/2014 03:20 PM, Marie Dogherty wrote:
> Dr. Therneau,
>
> I was wondering if you had a spare minute, if you could view a post in the R forum:
>
> http://r.789695.n4.nabble.com/CoxME-family-relatedness-td4696976.html
>
> I would appreciate it, I'm stuck and out of ideas!
>
> Many thanks
>
> Marie.

-----------Original post ----------

Hello all,

I have a table like this, with ~300 individuals:

Famid Id  Faid Moid Cohort  Sex  Survival Event SNP1 SNP2 SNP3

1    1    0    0    0    0    10   1    0    1    0

2    2    0    0    1    1    20   1    0    0    0

2    3    0    0    0    0    25   1    0    1    0

4    5    1    2    0    0    35   1    0    1    1

4    6    1    2    0    0    35   0    1    0    1

famid=family id, id=id of person,faid=father id,moid=mother id.

My question of interest: What impact does SNP1/SNP2/SNP3 have on survival of individuals 
(Id), after accounting for possible effects due to family relatedness (Famid).

So I want to account for family-specific frailty effects, and individual frailty effects 
according to degree of relationship between individuals.

The commands I've used are from this paper: http://www.ncbi.nlm.nih.gov/pubmed/21786277

Library(survival)
Library(coxme)
Library(kinship2)
Library(bdsmatrix)

Death.dat <-read.table(“Table”,header=T)

#Make a kinship matrix for the whole study
deathdat.kmat 
<-makekinship(famid=death.dat$famid,id=death.dat$id,father=death.dat$faid,mother=death.dat$moid)

##omit linker individuals with no phenotypic data, used only to ##properly specify the 
pedigree when calculating kinship ##coefficints:
death.dat1<-subset(death.dat,!is.na(Survival))

##temp is an array with the indices of the individuals with Survival years:
all <-dimnames(deathdat.kmat)[[1]]
temp <-which(!is.na(death.dat$Survival[match(all,death.dat$id)]))

##kinship matrix for the subset with phenotype Survival:
deathdat1.kmat <-deathdat.kmat[temp,temp]

If I type:

model3 
<-coxme(Surv(Survival,Event)~Sex+strata(Cohort)+SNP1+SNP2+SNP3+(1+id|famid),data=death.dat,varlist=deathdat1.kmat)

I get:

“Error in coxme(Surv(Survival, Event) ~ Sex + strata(Cohort) + SNP1 +  :
   In random term 1: Mlist cannot have both covariates and grouping”

Whereas if I type:

model3 
<-coxme(Surv(Survival,Event)~Sex+strata(Cohort)+SNP1+SNP2+SNP3,(id|famid),data=death.dat1,varlist=deathdat1.kmat)

I get:

Error in coxme(Surv(Survival, Event) ~ Sex + strata(Cohort) + SNP1 +  :
   No observations remain in the data set
In addition: Warning message:
In Ops.factor(id, famid) : | not meaningful for factors

Eventually, I would like to, once I has this piece of code working, use:

famblockf.mat<-bdsBlock(death.dat1$id,death.dat1$famid1)

model4 
<-coxme(Surv(Survival,Event)~Sex+strata(Cohort)+SNP1+SNP2+SNP3,data=death.dat1,id|famid,varlist=list(deathdat1.kmat,famblockf.mat),pdcheck=FALSE)

to account for both family specific frailty effects and individual relatedness between 
families.

If someone could explain the error(s) to me/tell me what I'm doing wrong/suggest the right 
way, I would appreciate it as I'm fairly new to R.

Also, if someone could explain the difference between (id|famid) and (1+id|famid), I'd 
appreciate it. I did both of them with a test, and couldn't see much difference between them.

Thanks.