[R] svymean using the survey package - strata containing no subpopulation members
Anthony Damico
ajdamico at gmail.com
Thu Jun 23 05:54:38 CEST 2016
hi pradip, with meps you should be able to match precisely between r+survey
and those other languages[1]
if i had to guess, i would say that your sas and stata code is actually
doing the equivalent of this, which is not correct. check the journal
article table #1 for syntax comparisons
design <- svydesign(id=~varpsu,strat=~varstr, weights=~perwt13f,
data=subset(x, racethx==4 & diabdx==1), nest=TRUE)
svymean(~ totexp13, design)
#Error in onestrat(x[index, , drop = FALSE], clusters[index],
nPSU[index][1], :
# Stratum (1004) has only one PSU at stage 1
more detailed discussion of lonely psu behavior at [2] including how to
override this error -- but i think it comes from faulty design
specification so should not be necessary? thanks
[1] https://journal.r-project.org/archive/2009-2/RJournal_2009-2_Damico.pdf
[2] http://faculty.washington.edu/tlumley/old-survey/exmample-lonely.html
On Wed, Jun 22, 2016 at 9:32 PM, Muhuri, Pradip (AHRQ/CFACT) <
Pradip.Muhuri at ahrq.hhs.gov> wrote:
> Hi,
>
> Below is a reproducible example that produces the estimate of "totexp13"
> (total health care expenditure 2013) for the subpopulation that includes
> "Asians with diabetes diagnosed" in MEPS. The R script below downloads file
> from the web for processing.
>
> Issue/Question: The R/survey package does not seem to provide a NOTE
> regarding the number of strata containing NO SUBOPOPULATION MEMBERS (in
> this case - Asians with diabetes diagnosed in MEPS 2013). Is there a way to
> get this count or ask R to provide this information? Any hints will be
> appreciated.
>
> Acknowledgements: The current R script is a tweaked-version of the code
> originally sent (on this forum) by Anthony Damico for another application.
> Thanks to Anthony!
>
> Good news: The estimate is almost the same as the estimates obtained from
> SAS, SUDAAN and STATA runs.
>
> Additional Information: STATA provides a NOTE that " 84 strata omitted
> because no subpopulation members".
> SAS LOG (proc surveymeans) provides a NOTE that "Only one cluster in a
> stratum in domain Asian_with_diab for variable(s) TOTEXP13. The estimate of
> variance for TOTEXP13 will
> omit this stratum".
>
>
> Thanks,
>
> Pradip Muhuri
>
>
>
>
> library(foreign)
> library(survey)
> library(dplyr)
>
> tf <- tempfile()
>
> download.file( "https://meps.ahrq.gov/mepsweb/data_files/pufs/h163ssp.zip",
> tf , mode = 'wb' )
>
> z <- unzip( tf , exdir = tempdir() )
>
> x <- read.xport( z )
>
> names( x ) <- tolower( names( x ) )
>
> mydata <- select(x, varstr, varpsu, perwt13f, diabdx, totexp13, racethx)
>
> mydata[mydata <=0] <- NA
>
> design <- svydesign(id=~varpsu,strat=~varstr, weights=~perwt13f, data=x,
> nest=TRUE)
>
>
> # include missings as "No" values here
> #design <-
> # update(design,
> # xbpchek53 = ifelse(bpchek53 ==1,'yes','no or missing'),
> # xcholck53 = ifelse(cholck53 ==1, 'yes','no or missing')
> #)
>
> # get the estimate for "totexp13" for the subset that includes Asians with
> diabetes diagnosed
> svymean(~ totexp13, subset(design, racethx==4 & diabdx==1))
>
> Pradip K. Muhuri, AHRQ/CFACT
> 5600 Fishers Lane # 7N142A, Rockville, MD 20857
> Tel: 301-427-1564
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
[[alternative HTML version deleted]]
More information about the R-help
mailing list