[R] Question about Survey Package

Thomas Lumley tlumley at u.washington.edu
Mon Oct 10 17:58:25 CEST 2005


On Mon, 10 Oct 2005, Real Miranda Rigoberto wrote:

> I have a question about how the survey package calculates variance 
> estimates.
>
> I need to estimate the variance for different domains, but for a 
> stratified sampling design in several stages. Särndal et al (1992), 
> chapter 10, makes reference to this problem.
>
> My question is if it is possible by means of "survey package" to obtain 
> these calculations, and if it follows the methodology raised by Särndal 
> or another author.
>

Yes, it is possible.

The computations for totals are based on the use of domain indicator 
variables when computing variances, as in Särndal et al (1992), and the 
handling of multistage sampling is as in chapter 4.4 of that book. The 
computations for statistics other than totals are based on estimating the 
total of a suitable estimating function and then solving the estimating 
equation.
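
As a rough illustration of the domain-indicator idea for totals, the sketch 
below compares a domain total computed with subset() against the total of 
y1 multiplied by the domain indicator over the whole sample. It assumes the 
mu284 data shipped with the package and the two-stage design built in 
example(mu284); the object names t_sub and t_ind are only illustrative.

```r
library(survey)
data(mu284)

# Two-stage design with finite population corrections, as in example(mu284)
dmu284 <- svydesign(id = ~id1 + id2, fpc = ~n1 + n2, data = mu284)

# Domain total obtained by subsetting the design object
t_sub <- svytotal(~y1, subset(dmu284, id2 > 1))

# Same total from the whole sample: y1 is set to zero outside the domain
t_ind <- svytotal(~I(y1 * (id2 > 1)), dmu284)

print(t_sub)
print(t_ind)
```

The two calls should agree in both the estimate and its standard error, 
because subset() keeps the design information for the full sample.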

In fact, for domain means there are three equivalent ways to compute the 
estimator and its variance, and one of the package tests checks that these 
give the same answer.

With the data set from example(mu284) we could compute the mean for the 
completely artificial domain id2>1 by
     svymean(~y1, subset(dmu284, id2>1))
The subset() function knows how to handle survey designs to give correct domain 
estimation.

This is equivalent to two more obviously correct estimators based on the 
whole sample: a regression estimator
     summary(svyglm(y1~factor(id2>1)+0, design=dmu284))
and to a ratio estimator
     svyratio(~as.numeric(y1*(id2>1)), ~as.numeric(id2>1), design=dmu284)

All three give the same mean estimator and standard error.
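
For anyone who wants to check this side by side, here is a minimal sketch 
that runs the three estimators and collects the estimates and standard 
errors in one table. It assumes the design object is built as in 
example(mu284); the names m_sub, m_glm, m_rat, est and se are only 
illustrative, and the second svyglm coefficient is taken to be the one for 
id2>1 being TRUE.

```r
library(survey)
data(mu284)

# Two-stage design with finite population corrections, as in example(mu284)
dmu284 <- svydesign(id = ~id1 + id2, fpc = ~n1 + n2, data = mu284)

# 1. Domain mean from the subsetted design
m_sub <- svymean(~y1, subset(dmu284, id2 > 1))

# 2. Regression estimator: one coefficient per level of the domain indicator
m_glm <- svyglm(y1 ~ factor(id2 > 1) + 0, design = dmu284)

# 3. Ratio estimator: total of y1 in the domain over the domain size
m_rat <- svyratio(~I(y1 * (id2 > 1)), ~as.numeric(id2 > 1), design = dmu284)

# Collect the domain estimate and standard error from each approach
est <- c(subset     = as.numeric(coef(m_sub)),
         regression = as.numeric(coef(m_glm)[2]),
         ratio      = as.numeric(coef(m_rat)))
se  <- c(subset     = as.numeric(SE(m_sub)),
         regression = as.numeric(sqrt(diag(vcov(m_glm)))[2]),
         ratio      = as.numeric(SE(m_rat)))

print(rbind(estimate = est, se = se))
```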

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

