[R] Post Stratification

Mark Hempelmann e.rehak at t-online.de
Sun Jun 18 22:06:20 CEST 2006


Dear WizaRds,

having met some of you in person in Vienna, I think even more fondly 
of this community and hope to continue on this route. It was great 
talking with you and learning from you. Thank you.

I am trying to work through an artificial example in 
post-stratification. This is my dataset:

library(survey)
age <- data.frame(id=1:8, stratum=rep( c("S1","S2"),c(5,3)), 
weight=rep(c(3,4),c(5,3)), nh=rep(c(5,3),c(5,3)), 
Nh=rep(c(15,12),c(5,3)), y=c(23,25,27,21,22, 77,72,74) )

pop.types <- table(stratum=age$stratum)
age.post <- svydesign(ids=~1, strata=NULL, data=age, fpc=~Nh) ## no clusters, no strata

post <- postStratify(design=age.post, strata=~stratum, population=pop.types)

svymean  (~y, post)
svytotal (~y, post)

gives
     mean     SE
y 42.625 0.5467
   total     SE
y   341 4.3737

So, is it correct to define pop.types as the number of elements sampled 
per stratum (nh), or should it rather be the total number of elements 
per stratum (Nh)? If the latter:

pop.types <- data.frame(stratum = c("S1","S2"), Freq = c(15, 12))
The help says: "The 'population' totals can be specified as a table with 
the strata variables in the margins, or as a data frame where one 
column lists frequencies and the other columns list the unique 
combinations of strata variables." Is the data frame above what is meant?
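
For concreteness, here is a minimal sketch of the data-frame form the help text describes. It rests on two assumptions that the quoted help text does not confirm for this example: that the population totals are the stratum sizes Nh (15 and 12), and that the design is declared with the sampling weights (weights=~weight) instead of fpc=~Nh. The name des is made up for illustration, and the survey calls only run if the package is installed.

```r
# Same artificial data as above.
age <- data.frame(id = 1:8,
                  stratum = rep(c("S1", "S2"), c(5, 3)),
                  weight  = rep(c(3, 4), c(5, 3)),
                  nh      = rep(c(5, 3), c(5, 3)),
                  Nh      = rep(c(15, 12), c(5, 3)),
                  y       = c(23, 25, 27, 21, 22, 77, 72, 74))

# Assumption: 'population' holds the population counts Nh per stratum,
# one row per stratum, with the counts in a column named Freq.
pop.types <- data.frame(stratum = c("S1", "S2"), Freq = c(15, 12))

if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)
  # Assumption: declare the design through the sampling weights only.
  des  <- svydesign(ids = ~1, data = age, weights = ~weight)
  post <- postStratify(design = des, strata = ~stratum,
                       population = pop.types)
  print(svymean(~y, post))
  print(svytotal(~y, post))
}
```

Here the weights already sum to 15 and 12 within the strata, so under these assumptions post-stratifying to Nh should leave them unchanged.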

However, I compute:
Nh <- c(15,12); nh <- c(5,3); N <- sum(Nh)
sh <- by(age$y, age$stratum, var) ## sample variance of y per stratum
# Mean estimator
y.bar <- by(age$y, age$stratum, mean) ## 23.6; 74.33
estimator <- 1/N*sum(Nh*y.bar) ## 46.14815
# Variance estimator
vari <- 1/N^2*sum(Nh*(Nh-nh)*sh/nh)
sqrt(vari) ## .7425903

and with a Taylor expansion I get .7750118. Neither value matches the 
SE of 0.5467 that svymean reports above.
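
Written out, the hand computation above corresponds to the following formulas, with s_h^2 denoting the sample variance of y in stratum h (the quantity stored in sh). The numbers are the ones from the example data; the notation \bar{y}_{st} is only a label chosen here.

```latex
\bar{y}_{st} = \frac{1}{N}\sum_{h} N_h\,\bar{y}_h
             = \frac{15 \cdot 23.6 + 12 \cdot 74.333}{27} \approx 46.148

\widehat{V}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{h} N_h\,(N_h - n_h)\,\frac{s_h^2}{n_h}
             = \frac{15 \cdot 10 \cdot 5.8/5 \;+\; 12 \cdot 9 \cdot 6.333/3}{27^2}
             = \frac{174 + 228}{729} \approx 0.5514,
\qquad \sqrt{0.5514} \approx 0.7426
```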

Please help me correct my mistakes. Thank you so much.
Yours
mark


