[R] CoxME: Family relatedness
Therneau, Terry M., Ph.D.
therneau at mayo.edu
Mon Sep 15 23:20:18 CEST 2014
I would have caught this tomorrow (I read the digest).
Some thoughts:
1. Skip the entire step of subsetting the death.kmat object. The coxme function knows how
to do this on its own, and is more likely to get it correct. My version of your code would be
deathdat.kmat <- 2* with(deathdat, makekinship(famid, id, faid, moid))
model3 <- coxme(Surv(Survival, Event) ~ Sex + strata(cohort) + SNP1 + SNP2 + SNP3 + (1|id),
data=death.dat, varlist=deathdat.kmat)
This all assumes that the "id" variable is unique. If family 3 and family 4 both have an
id of "1", then the coxme call can't match up rows in the data to rows/cols in the kinship
matrix uniquely. But that is simple to fix.
The kinship matrix K has .5 on the diagonal, by definition, but when used as a correlation
most folks prefer to use 2K. (This causes mixups since some software adds the "2" for
you, but coxme does not.)
2. The model above is the correct covariance structure for a set of families. There is a
single intercept per subject, with a complex correlation matrix. The simpler "per family"
frailty model would be
model4 <- coxme(Surv(Survival, Event) ~ Sex + strata(cohort) + SNP1 + SNP2 + SNP3 +
(1|famid), death.dat)
This model lets each family have a separate risk, with everyone in the same family sharing
the exact same risk. It is less general than model3 above which lets a family have higher
risk plus has variation between family members.
A model with both per-subject and per family terms is identical to one with a covariance
matrix of s1 K + s2 B, where K is the kinship matrix, B is a block diagonal matrix which
has a solid block of "1" for each family, and s1 s2 are the fitted variance coefficients.
I don't find this intuitive, but have seen the argument that "B" is a shared
environmental effect. (Perhaps because I have large family trees where they do not all
live together). If you want such a model:
model5 <- coxme(...... + (1|id) + (1|famid), death.dat, varlist=deathdat.kmat)
(When the varlist is too short the program uses the default for remaining terms).
3. To go further you will need to tell us what you are trying to model, as math formulas
not as R code.
4. The error messages you got would more properly read "I'm confused" on the part of the
program. They cases of something I would never do, so never got that message. Therefore
useful for me to see. Continuous variables go to the left of the | and categoricals to
the right of the |. Having a family id to the left makes no sense at all.
Terry Therneau
On 09/15/2014 03:20 PM, Marie Dogherty wrote:
> Dr. Therneau,
>
> I was wondering if you had a spare minute, if you could view a post in the R forum:
>
> http://r.789695.n4.nabble.com/CoxME-family-relatedness-td4696976.html
>
> I would appreciate it, I'm stuck and out of ideas!
>
> Many thanks
>
> Marie.
-----------Original post ----------
Hello all,
I have a table like this, with ~300 individuals:
Famid Id Faid Moid Cohort Sex Survival Event SNP1 SNP2 SNP3
1 1 0 0 0 0 10 1 0 1 0
2 2 0 0 1 1 20 1 0 0 0
2 3 0 0 0 0 25 1 0 1 0
4 5 1 2 0 0 35 1 0 1 1
4 6 1 2 0 0 35 0 1 0 1
famid=family id, id=id of person,faid=father id,moid=mother id.
My question of interest: What impact does SNP1/SNP2/SNP3 have on survival of individuals
(Id), after accounting for possible effects due to family relatedness (Famid).
So I want to account for family-specific frailty effects, and individual frailty effects
according to degree of relationship between individuals.
The commands I've used are from this paper: http://www.ncbi.nlm.nih.gov/pubmed/21786277
Library(survival)
Library(coxme)
Library(kinship2)
Library(bdsmatrix)
Death.dat <-read.table(“Table”,header=T)
#Make a kinship matrix for the whole study
deathdat.kmat
<-makekinship(famid=death.dat$famid,id=death.dat$id,father=death.dat$faid,mother=death.dat$moid)
##omit linker individuals with no phenotypic data, used only to ##properly specify the
pedigree when calculating kinship ##coefficints:
death.dat1<-subset(death.dat,!is.na(Survival))
##temp is an array with the indices of the individuals with Survival years:
all <-dimnames(deathdat.kmat)[[1]]
temp <-which(!is.na(death.dat$Survival[match(all,death.dat$id)]))
##kinship matrix for the subset with phenotype Survival:
deathdat1.kmat <-deathdat.kmat[temp,temp]
If I type:
model3
<-coxme(Surv(Survival,Event)~Sex+strata(Cohort)+SNP1+SNP2+SNP3+(1+id|famid),data=death.dat,varlist=deathdat1.kmat)
I get:
“Error in coxme(Surv(Survival, Event) ~ Sex + strata(Cohort) + SNP1 + :
In random term 1: Mlist cannot have both covariates and grouping”
Whereas if I type:
model3
<-coxme(Surv(Survival,Event)~Sex+strata(Cohort)+SNP1+SNP2+SNP3,(id|famid),data=death.dat1,varlist=deathdat1.kmat)
I get:
Error in coxme(Surv(Survival, Event) ~ Sex + strata(Cohort) + SNP1 + :
No observations remain in the data set
In addition: Warning message:
In Ops.factor(id, famid) : | not meaningful for factors
Eventually, I would like to, once I has this piece of code working, use:
famblockf.mat<-bdsBlock(death.dat1$id,death.dat1$famid1)
model4
<-coxme(Surv(Survival,Event)~Sex+strata(Cohort)+SNP1+SNP2+SNP3,data=death.dat1,id|famid,varlist=list(deathdat1.kmat,famblockf.mat),pdcheck=FALSE)
to account for both family specific frailty effects and individual relatedness between
families.
If someone could explain the error(s) to me/tell me what I'm doing wrong/suggest the right
way, I would appreciate it as I'm fairly new to R.
Also, if someone could explain the difference between (id|famid) and (1+id|famid), I'd
appreciate it. I did both of them with a test, and couldn't see much difference between them.
Thanks.
More information about the R-help
mailing list