[R] Survey package
Thomas Lumley
tlumley at u.washington.edu
Mon Sep 10 18:02:31 CEST 2007
On Sun, 9 Sep 2007, eugen pircalabelu wrote:
> A short example:
>
> stratum  id  weight  nh  Nh   y  sex
>       1   1       3   5  15  23    1
>       1   2       3   5  15  25    1
>       1   3       3   5  15  27    2
>       1   4       3   5  15  21    2
>       1   5       3   5  15  22    1
>       2   6       4   3  12  33    1
>       2   7       4   3  12  27    1
>       2   8       4   3  12  29    2
>
> where nh is the sample size in each stratum, Nh is the corresponding
> population size, and y is a metric variable.
>
> Now if I let
>
> design <- svydesign(id = ~1, data = age, strata = ~stratum, fpc = ~Nh)
> then weights(design) gives me 3,3,3,3,3,4,4,4.
>
> If I then let
>
> x <- postStratify(design, strata = ~sex, data.frame(sex = c("1", "2"), freq = c(10, 15)))
> the weights become
>
> 1 2 3 4 5 6 7 8
> 2.17 2.17 5.35 5.35 2.17 1.73 1.73 4.28
>
> If I define
>
> design <- svydesign(id = ~1, data = age)
> x <- postStratify(design, strata = ~sex, data.frame(sex = c("1", "2"), freq = c(10, 15)))
> the weights become 2 2 5 5 2 2 2 5
>
> The question: does postStratify() recognize that I have already
> stratified by stratum in the first design, so that it then
> post-stratifies by sex? And why is that? (I ask because I don't have the
> full joint distribution, the sex*stratum crossing, which I would need in
> order to apply the postStratify function correctly.) I see that Mr Lumley
> uses the postStratify function when the design does not include strata
> (e.g. from ?postStratify:
>
This gives you a design stratified by stratum and post-stratified by sex,
which is not the same as stratifying by stratum*sex or post-stratifying by
stratum*sex.
In this case you should probably rake() on stratum and sex rather than
just post-stratifying. Post-stratifying on sex is equivalent to one iteration
of the iterative proportional fitting algorithm used in raking.
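
A minimal sketch of that idea in base R, without the survey package itself. The sex totals of 12 and 15 used here are hypothetical, chosen only so that they add up to the same 27 as the Nh values (the 10 and 15 in the question add up to 25, so both margins there cannot be matched at once). The helper name post_step is likewise only illustrative and is not a package function.

```r
# Data from the question above.
age <- data.frame(
  stratum = c(1, 1, 1, 1, 1, 2, 2, 2),
  id      = 1:8,
  nh      = c(5, 5, 5, 5, 5, 3, 3, 3),
  Nh      = c(15, 15, 15, 15, 15, 12, 12, 12),
  y       = c(23, 25, 27, 21, 22, 33, 27, 29),
  sex     = c(1, 1, 2, 2, 1, 1, 1, 2)
)

# Stratum margin taken from Nh (15 + 12 = 27).
pop_stratum <- c("1" = 15, "2" = 12)
# HYPOTHETICAL sex margin: picked only so that it also sums to 27.
pop_sex <- c("1" = 12, "2" = 15)

# One post-stratification step (illustrative helper): rescale the weights
# so that the weighted count in each level of `by` equals the population
# total given for that level in `pop`.
post_step <- function(w, by, pop) {
  by <- as.character(by)
  totals <- tapply(w, by, sum)
  as.numeric(w * pop[by] / totals[by])
}

# Design weights Nh / nh: 3 3 3 3 3 4 4 4
w0 <- age$Nh / age$nh

# Post-stratifying on sex alone is a single step; the sex totals are then
# matched, but the stratum totals are no longer exactly 15 and 12.
w_ps <- post_step(w0, age$sex, pop_sex)
tapply(w_ps, age$stratum, sum)

# Raking alternates between the two margins until the weights settle.
w <- w0
for (i in 1:100) {
  w_old <- w
  w <- post_step(w, age$stratum, pop_stratum)
  w <- post_step(w, age$sex, pop_sex)
  if (max(abs(w - w_old)) < 1e-8) break
}

tapply(w, age$stratum, sum)  # stratum totals after raking
tapply(w, age$sex, sum)      # sex totals after raking
```

With the survey package the analogous call would be along the lines of rake(design, list(~stratum, ~sex), list(...)), with one table or data frame of population totals per margin.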
-thomas