[R] Bootstrapping and estimation of standard error

Tue May 26 19:35:45 CEST 2009

Hello,

I started using R a few months ago and I really like it.

I need to estimate the standard deviation of certain statistics (some 
measures of poverty). I found a really simple program, and I just need 
to check whether it is OK and really calculates what it is supposed to.

Take, for example, the head-count index (one of the simplest measures of 
poverty), which is calculated as a proportion: q/n, where q is the number 
of poor and n is the size of the population. This can be easily 
programmed, e.g. as:

headcount <- function(x = 1:10)
{
  # incomes below the poverty line (90000)
  y <- x[x < 90000]
  # head-count index: q/n
  H <- length(y) / length(x)
  c(h_index = H)
}

There are probably other ways to do it, but I'm just a beginner :-). 
(FYI: x is the vector of income data and 90000 is the poverty line.)
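
For instance, one shorter variant (only a sketch) uses the fact that the 
mean of a logical vector is the proportion of TRUE values. The name 
headcount2, the argument poverty_line and the income vector below are 
made up for illustration:

```r
# Alternative head-count index: share of incomes below the poverty line.
# `poverty_line` is a hypothetical argument, defaulting to the 90000 used above.
headcount2 <- function(x, poverty_line = 90000) {
  # x < poverty_line is TRUE/FALSE per person; its mean is q/n
  c(h_index = mean(x < poverty_line))
}

# Made-up incomes, for illustration only: 2 of the 5 are below 90000,
# so the index is 2/5 = 0.4
incomes <- c(50000, 85000, 90000, 120000, 200000)
headcount2(incomes)
```

Note that an income exactly equal to the poverty line is not counted as 
poor here, the same as in the function above.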

One way to estimate the standard deviation is bootstrapping. I found a 
simple program:
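# 1000 resamples of size 100, drawn with replacement from the incomes
resamples.h <- lapply(1:1000, function(i)
  sample(silc$prijem, size = 100, replace = TRUE))
# head-count index of each resample
r.headcount <- sapply(resamples.h, headcount)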

Then it is easy to estimate the S.E. from r.headcount.
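
As a sketch of that last step: the S.E. would be taken as the standard 
deviation of the bootstrap replicates. Since the silc data are not 
included here, the code uses simulated log-normal incomes as a stand-in; 
the distribution, its parameters and the seed are assumptions, not 
properties of the real data.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Stand-in for silc$prijem: simulated incomes (assumption, not real data)
income <- rlnorm(5000, meanlog = log(150000), sdlog = 0.6)

# Head-count index with the poverty line fixed at 90000
headcount <- function(x) {
  c(h_index = length(x[x < 90000]) / length(x))
}

# 1000 resamples of size 100, drawn with replacement
resamples.h <- lapply(1:1000, function(i)
  sample(income, size = 100, replace = TRUE))
r.headcount <- sapply(resamples.h, headcount)

# Bootstrap estimate of the standard error: SD of the replicates
se.h <- sd(r.headcount)
se.h
```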

So my first question is whether this is correct. I would also like to 
ask how the number of replications (here 1000) and the size of each 
resample (here 100, whereas the real sample has about 5000 observations) 
influence the results. How can those two values be justified? Is the 
choice just subjective, or are there methods for choosing them?
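
To make the question concrete, here is a sketch that repeats the 
bootstrap with resamples of size 100 and of the full sample size. It 
again runs on simulated incomes rather than my data; the distribution, 
the seed and the helper name boot.se are assumptions. For a proportion, 
the spread of the replicates shrinks roughly like 1/sqrt(size), so the 
two settings should give quite different numbers:

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Simulated incomes standing in for the real sample of about 5000 (assumption)
income <- rlnorm(5000, meanlog = log(150000), sdlog = 0.6)

# Head-count index with the poverty line fixed at 90000
headcount <- function(x) length(x[x < 90000]) / length(x)

# boot.se is a hypothetical helper: SD of B bootstrap replicates,
# each computed from a resample of the given size
boot.se <- function(x, size, B = 1000) {
  sd(replicate(B, headcount(sample(x, size = size, replace = TRUE))))
}

se.100  <- boot.se(income, size = 100)             # resamples of size 100
se.full <- boot.se(income, size = length(income))  # resamples of size 5000
c(size_100 = se.100, size_5000 = se.full)
```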

I really appreciate your time and help. Thanks a lot.

Tomas