[R] Multiple Imputation in mice/norm
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Sat Apr 25 15:25:19 CEST 2009
Emmanuel Charpentier wrote:
> On Friday 24 April 2009 at 14:11 -0700, ToddPW wrote:
>> I'm trying to use either mice or norm to perform multiple imputation to fill
>> in some missing values in my data. The data has some missing values because
>> of a chemical detection limit (so they are left censored). I'd like to use
>> MI because I have several variables that are highly correlated. In SAS's
>> proc MI, there is an option with which you can limit the imputed values that
>> are returned to some range of specified values. Is there a way to limit the
>> values in mice?
>
> You can do that by writing your own imputation function and assigning it
> to the variables concerned (see the "imputationMethod" argument and the
> details in the man page for "mice").
>
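A minimal sketch of what such a user-written method might look like, in case it helps. The name "bounded" and the lower/upper arguments are made up for illustration; the only thing taken from mice is the convention that a method called "foo" is looked up as a function mice.impute.foo(y, ry, x, ...). Note the simplification: it draws from a normal regression model truncated to the allowed range and ignores uncertainty in the regression coefficients, so it is not a fully proper imputation. The variable is assumed to be on the log scale already.

```r
# Hypothetical custom method "bounded" (name and bounds arguments are
# illustrative). y = variable to impute, ry = TRUE where y is observed,
# x = numeric matrix of predictors.
mice.impute.bounded <- function(y, ry, x, lower = -Inf, upper = Inf, ...) {
  x <- as.matrix(x)
  xobs <- cbind(1, x[ry, , drop = FALSE])
  xmis <- cbind(1, x[!ry, , drop = FALSE])
  fit <- lm.fit(xobs, y[ry])
  beta <- fit$coefficients
  beta[is.na(beta)] <- 0                       # ignore aliased predictors
  sigma <- sqrt(sum(fit$residuals^2) / max(1, fit$df.residual))
  mu <- drop(xmis %*% beta)
  # Inverse-CDF draw from a normal distribution truncated to [lower, upper]
  u <- runif(length(mu), pnorm(lower, mu, sigma), pnorm(upper, mu, sigma))
  pmin(pmax(qnorm(u, mu, sigma), lower), upper) # guard against rounding
}

# Simulated example: log-concentrations below an assumed detection limit
# are reported as NA.
set.seed(42)
n <- 100
z <- rnorm(n)
logconc <- 0.3 + 0.9 * z + rnorm(n, sd = 0.4)
lod <- log(0.8)                                 # assumed limit, log scale
dat <- data.frame(z = z, logconc = ifelse(logconc < lod, NA, logconc))

# Calling the function directly, the way mice would for the missing rows
draws <- mice.impute.bounded(dat$logconc, !is.na(dat$logconc),
                             as.matrix(dat["z"]), upper = lod)
range(draws)

# With mice installed, something like this should pick the method up
# (in current releases the argument is spelled "method"):
# imp <- mice::mice(dat, m = 5, method = c("", "bounded"), upper = lod)
```

Every imputed value stays at or below the detection limit by construction, which is the restriction the original question asks for.
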
>> If not, is there another MI tool in R that will allow me to
>> specify a range of acceptable values for my imputed data?
>
> In the function amelia (package "Amelia"), you can specify a "bounds"
> argument, which allows for such a limitation. However, be aware that
> this might violate the basic assumption of Amelia, which is that your
> data are multivariate normal. Maybe a change of variable is in order
> (e.g. log(concentration) usually has much better statistical properties
> than concentration).
>
> Frank Harrell's aregImpute (package Hmisc) has the "curtail" argument
> (TRUE by default) which limits imputations to the range of observed
> values.
>
> But if your left-censored variables are your dependent variables (not
> covariates), may I suggest analyzing these data as censored data, as
> allowed by Terry Therneau's "survreg" function (package "survival")? Code
> your "missing" data as such a variable (use:
> survreg(Surv(ifelse(is.na(x), <yourlimit>, x),
> !is.na(x), type="left") ~ <Yourmodel>) to do this on the fly).
>
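A small sketch of the censored-data route on simulated data. It uses survreg from the survival package, since as far as I know that is the fitting function that accepts type="left" in Surv (coxph does not). The variable names, the detection limit of 1 and the lognormal distribution are all assumptions made for the example.

```r
library(survival)

set.seed(1)
n <- 200
z <- rnorm(n)                                  # a covariate
conc <- exp(0.5 + 0.8 * z + rnorm(n, sd = 0.5))
limit <- 1                                     # assumed detection limit
x <- ifelse(conc < limit, NA, conc)            # below-limit values come back NA

# Recode: censored observations take the limit as their value,
# and the status flag is TRUE only where a value was actually measured.
value <- ifelse(is.na(x), limit, x)
observed <- !is.na(x)

# Left-censored lognormal regression (a Tobit-type model on the log scale)
fit <- survreg(Surv(value, observed, type = "left") ~ z, dist = "lognormal")
summary(fit)
```

The coefficient for z is on the log-concentration scale, so it can be compared directly with the slope used to simulate the data.
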
> Another possible idea is to split your (supposedly x) variable in two:
> observed (logical) and value (observed value if observed, <detection
> limit> if not), and include both in your model. You will probably
> run into numerical difficulties due to the built-in total
> separation.
>
> HTH,
>
> Emmanuel Charpentier
>
>> Thanks for the help,
>> Todd
Also see
@Article{zha09non,
  author  = {Zhang, Donghui and Fan, Chunpeng and Zhang, Juan and Zhang, {Cun-Hui}},
  title   = {Nonparametric methods for measurements below detection limit},
  journal = {Stat in Med},
  year    = 2009,
  volume  = 28,
  pages   = {700-715},
  annote  = {lower limit of detection; left censoring; Tobit model; Gehan test; Peto-Peto test; log-rank test; Wilcoxon test; location shift model; superiority of nonparametric methods}
}
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University