[R] bootstrap confidence intervals with previously existing bootstrap sample
Tim Hesterberg
timh at insightful.com
Wed Sep 5 17:40:21 CEST 2007
>On Tue, 4 Sep 2007, dolph008 at umn.edu wrote:
>> I am new to R. I would like to calculate bootstrap confidence intervals
>> using the BCa method for a parameter of interest. My situation is this: I
>> already have a set of 1000 bootstrap replicates created from my original
>> data set. I have already calculated the statistic of interest for each
>> bootstrap replicate, and have also calculated the mean for this statistic
>> across all the replicates. Now I would like to calculate Bca confidence
>> intervals for this statistic. Is there a way to import my
>> previously-calculated set of 1000 statistics into R, and then calculate
>> bootstrap confidence intervals around the mean from this imported data?
>>
>> I have found the code for boot.ci in the manual for the boot package, but
>> it looks like it requires that I first use the "boot" function, and then
>> apply the output to "boot.ci". Because my bootstrap samples already exist,
>> I don't want to use "boot", but just want to import the 1000 values I have
>> already calculated, and then get R to calculate the mean and Bca confidence
>> intervals based on these values. Is this possible?
Brian Ripley wrote:
>Yes, it is possible but you will have to study the internal structure of
>an object of class "boot" (which is documented on the help page) and mimic
>it. You haven't told us which type of bootstrap you used, which is one of
>the details you need to supply.
>
>It might be slightly easier to work with function bcanon in package
>bootstrap, which you would need to edit to suit your purposes.
>
>I don't know why you have picked on the BCa method: my experience is that
>if you need to correct the basic method you often need far more than 1000
>samples to get reliable results.

You can do the BCa, but you need to supply two parameters:

  z0:           typically calculated from the fraction of bootstrap
                statistics that are <= the original statistic
  acceleration: based on the skewness of the empirical influence
                function, typically calculated using the jackknife
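
A minimal sketch of that calculation, assuming you still have the original data (the jackknife needs it) and that the statistic can be recomputed on leave-one-out subsets. The function name bca_from_replicates and its arguments are made up for illustration; they are not part of the boot or bootstrap packages. The simulated theta_star stands in for the imported values, which in practice might be read with something like scan() on your own file.

```r
# Sketch: BCa interval from bootstrap statistics computed elsewhere.
# theta_star : vector of existing bootstrap statistics
# theta_hat  : statistic computed on the original data
# x, stat    : original data and statistic function (for the jackknife)
bca_from_replicates <- function(theta_star, theta_hat, x, stat, conf = 0.95) {
  # z0: normal quantile of the fraction of bootstrap statistics that
  # are <= the original statistic (infinite if that fraction is 0 or 1).
  z0 <- qnorm(mean(theta_star <= theta_hat))

  # Acceleration from the jackknife (leave-one-out) values.
  n <- length(x)
  theta_jack <- vapply(seq_len(n), function(i) stat(x[-i]), numeric(1))
  d <- mean(theta_jack) - theta_jack
  a <- sum(d^3) / (6 * sum(d^2)^1.5)

  # Adjust the nominal quantile levels, then read off bootstrap quantiles.
  z <- qnorm(c((1 - conf) / 2, 1 - (1 - conf) / 2))
  levels <- pnorm(z0 + (z0 + z) / (1 - a * (z0 + z)))
  ci <- quantile(theta_star, probs = levels, names = FALSE)

  list(z0 = z0, acceleration = a, levels = levels, ci = ci)
}

# Simulated stand-in for the poster's situation (assumed data, not theirs).
set.seed(1)
x <- rexp(50)
theta_hat <- mean(x)
theta_star <- replicate(10000, mean(sample(x, replace = TRUE)))

res <- bca_from_replicates(theta_star, theta_hat, x, mean)
print(res$ci)
```

With z0 = 0 and acceleration = 0 the adjusted levels reduce to the nominal ones, so the result is just the bootstrap percentile interval.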

I agree that you should do far more than 1000 samples. The BCa uses
bootstrap quantiles that are adjusted based on the z0 and acceleration
parameters, and estimating z0 from the bootstrap samples magnifies
the Monte Carlo error. You need roughly twice as many bootstrap samples
as for the bootstrap percentile interval, e.g. 10^4 instead of 5000.

If computational expense is an issue, you might prefer bootstrap
tilting intervals, which require about 1/37 as many bootstrap samples
as the BCa for comparable Monte Carlo variability.

Quick overview of confidence intervals:

                      accuracy   comments
t intervals           1/sqrt(n)  Using either formula or bootstrap
                                 standard error; poor in the presence
                                 of skewness.
bootstrap percentile  1/sqrt(n)  Good quick-and-dirty procedure.
                                 Partial skewness correction.
                                 Poor if the statistic is biased.
bootstrap t           1/n        Good coverage, but interval
                                 width can vary wildly when n is small.
BCa                   1/n        Current best overall, but you need
                                 a lot of bootstrap samples, e.g. 10^4.
tilting               1/n        Low Monte Carlo variability, so can
                                 use fewer bootstrap samples.
                                 Difficult to implement, and
                                 requires that statistic can be
                                 calculated with weights.
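
The first two rows of the overview need nothing beyond the bootstrap statistics themselves, so they are easy to get from an imported vector. A rough sketch, with simulated values standing in for the imported ones; the variable names are illustrative, and the n - 1 degrees of freedom is an assumption that suits a sample mean but not necessarily other statistics.

```r
# Simulated stand-in for previously computed bootstrap statistics.
set.seed(42)
x <- rexp(40)
theta_hat <- mean(x)
theta_star <- replicate(5000, mean(sample(x, replace = TRUE)))

# t interval using the bootstrap standard error.
se_boot <- sd(theta_star)
t_ci <- theta_hat + qt(c(0.025, 0.975), df = length(x) - 1) * se_boot

# Bootstrap percentile interval: plain quantiles of the replicates.
perc_ci <- quantile(theta_star, probs = c(0.025, 0.975), names = FALSE)

print(rbind(t = t_ci, percentile = perc_ci))
```

The t interval is symmetric about the original statistic by construction, while the percentile interval follows the shape of the bootstrap distribution, which is where its partial skewness correction comes from.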
Advertisement 1: tilting is available in S+Resample, available free
from www.insightful.com/downloads/libraries
Advertisement 2: I talk about these more in my short course,
Bootstrap Methods and Permutation Tests
Oct 10-11 San Francisco, 3-4 Oct UK.
http://www.insightful.com/services/training.asp
========================================================
| Tim Hesterberg Senior Research Scientist |
| timh at insightful.com Insightful Corp. |
| (206)802-2319 1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax) Seattle, WA 98109-3044, U.S.A. |
| www.insightful.com/Hesterberg |