[R] Bootstrapping and estimation of standard error
Tomas Zelinsky
tomas.zelinsky at tuke.sk
Tue May 26 19:35:45 CEST 2009
Hello,
I started using R a few months ago and I really like it.
I need to estimate the standard deviation of certain statistics (some
measures of poverty). I found a really simple program, and I just need
to check whether it's OK and really calculates what it's supposed to.
Take, for example, the head-count index (one of the simplest measures of
poverty), which is calculated as a proportion: q/n, where q is the number
of poor and n is the size of the population. This is easy to program,
e.g. as:
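headcount <- function(x = 1:10)
{
  # keep only the incomes below the poverty line (90000)
  y <- x[x < 90000]
  # head-count index: number of poor divided by the number of observations
  H <- length(y) / length(x)
  c(h_index = H)
}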
There are probably other ways to do it, but I'm just a beginner :-).
(FYI: x is the vector of income data and 90000 is the poverty line.)
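
For comparison, a minimal sketch of one of those other ways: because the index is a proportion, it can be taken as the mean of a logical vector. The name headcount2, the line argument and the example incomes are made up for illustration and are not part of the program above.

```r
# Hypothetical variant of headcount(): the poverty line is an argument
# instead of being hard-coded inside the function body.
headcount2 <- function(x, line = 90000) {
  # x < line is a logical vector; its mean is the share of TRUE values,
  # i.e. the proportion of incomes below the poverty line (q/n)
  c(h_index = mean(x < line))
}

# Made-up incomes, for illustration only
incomes <- c(50000, 120000, 80000, 95000)

# 2 of the 4 values are below 90000, so the index is 0.5
headcount2(incomes)
```

Passing the poverty line as an argument makes it easier to rerun the same calculation with a different threshold.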
One possibility for estimating the standard deviation is
bootstrapping. I found a simple program:
resamples.h <- lapply(1:1000, function(i)
  sample(silc$prijem, size = 100, replace = TRUE))
r.headcount <- sapply(resamples.h, headcount)
Then it's easy to estimate S.E.
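
A minimal sketch of that last step, assuming the S.E. is taken as the standard deviation of the bootstrap replicates. Since silc$prijem is not available here, the income vector below is simulated (an assumption, not real data), and the name se.headcount is made up for illustration.

```r
set.seed(1)  # only so that the sketch is reproducible

# Simulated stand-in for silc$prijem: 5000 made-up incomes, NOT real data
income <- rlnorm(5000, meanlog = log(120000), sdlog = 0.5)

# Head-count index with the poverty line at 90000, as above
headcount <- function(x) c(h_index = mean(x < 90000))

# Same scheme as above: 1000 resamples of size 100, drawn with replacement
resamples.h <- lapply(1:1000, function(i)
  sample(income, size = 100, replace = TRUE))
r.headcount <- sapply(resamples.h, headcount)

# Bootstrap estimate of the standard error: spread of the 1000 replicates
se.headcount <- sd(r.headcount)
se.headcount
```

Note that with resamples of size 100 this measures the variability of the index in samples of 100 observations, not in a sample of about 5000.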
So my first question is whether this is correct. I would also like
to ask how the number of replications (in this case 1000) and the size of
the resamples (here 100, while the real sample was about 5000) influence
the results. How can those two values be justified? Is the choice just
subjective, or are there methods for assessing those values?
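
One rough way to probe both questions empirically is to rerun the bootstrap while varying one value at a time. The sketch below does this on simulated incomes (an assumption, not real data); boot.se is a hypothetical helper written for this illustration, not a function from any package.

```r
set.seed(1)  # only so that the sketch is reproducible

# Simulated stand-in for the real income data: 5000 made-up values
income <- rlnorm(5000, meanlog = log(120000), sdlog = 0.5)

# Head-count index with the poverty line at 90000
headcount <- function(x) mean(x < 90000)

# Hypothetical helper: draw B resamples of size m with replacement and
# return the standard deviation of the B head-count replicates
boot.se <- function(x, B, m) {
  sd(replicate(B, headcount(sample(x, size = m, replace = TRUE))))
}

# Vary the resample size at a fixed number of replications
sizes <- c(100, 1000, 5000)
se.by.size <- sapply(sizes, function(m) boot.se(income, B = 1000, m = m))

# Vary the number of replications at a fixed resample size
reps <- c(50, 200, 1000)
se.by.B <- sapply(reps, function(B) boot.se(income, B = B, m = length(income)))

data.frame(size = sizes, se = se.by.size)
data.frame(replications = reps, se = se.by.B)
```

For a proportion the standard error scales roughly as 1/sqrt(m), so the estimate should shrink as the resample size m grows, while the number of replications mainly affects how noisy the estimate itself is. The usual bootstrap convention is to draw resamples of the same size as the original sample.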
I really appreciate your time and help. Thanks a lot.
Tomas