[R] Setting up hypothesis tests with the infer library?

Kevin Zembower kev|n @end|ng |rom zembower@org
Mon Mar 31 18:26:25 CEST 2025


Neal, thanks so much for precisely focusing on the exact question I
asked. While I can follow along with the workings of your solution, I
never would have thought of it myself.

I'm glad to learn of all the other solutions offered, too. They
broadened my understanding of what I was doing in my statistics class
and how I was doing it in R. Thanks, all who contributed to this
thread.

Thanks, again, Neal, for your help.

-Kevin

On Sat, 2025-03-29 at 13:46 -0700, Neal Fultz wrote:
> > 
> > I've been setting up problems like this with code similar to:
> > ===========================
> > df <- data.frame(
> >     survey = c(rep("1980", 1000), rep("2010", 1000)),
> >     DP = c(rep("Y", 0.66*1000), rep("N", 1000 - (0.66*1000)),
> >            rep("Y", 0.64*1000), rep("N", 1000 - (0.64*1000))))
> > 
> 
> 
> I think infer needs the data in 'long' format. Here is how I would
> approach it:
> 
> df <- expand.grid(year=c(1980, 2010), DP=c("Approve", "Disapprove"))
> 
> expand.grid() creates all combinations of year and DP (the view).
> 
> df <- df[ rep(1:4, c(660, 640, 1000-660, 1000-640)),  ]
> 
> Here I used data frame indexing to repeat each of the four rows. I
> find this a lot easier than manually creating columns the way you did.
> tidyverse-style packages don't encourage this style, though. Tastes
> will vary.
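
A minimal base-R sketch of the row-repetition step, spelled out so it is easy to see which count belongs to which row. It assumes the counts used in this thread (660 and 640 approvals out of 1000 per survey); the names `cells` and `counts` are illustrative and do not come from the original message.

```r
# expand.grid() varies its first argument fastest, so the rows come out as
# (1980, Approve), (2010, Approve), (1980, Disapprove), (2010, Disapprove).
cells <- expand.grid(year = c(1980, 2010), DP = c("Approve", "Disapprove"))

# One count per row of `cells`, in the same row order.
counts <- c(660, 640, 1000 - 660, 1000 - 640)

# Repeat each row index by its count, then index the data frame with the result.
df <- cells[rep(seq_len(nrow(cells)), counts), ]
rownames(df) <- NULL  # drop the duplicated-row labels such as "1.1", "1.2"

# Cross-tabulate to confirm each cell received the intended count.
table(df$year, df$DP)
```

Because the counts vector has to line up with the row order of `cells`, printing `cells` once before repeating the rows is a cheap way to avoid swapping two cells.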
> 
> In dplyr style, add a column and use it in slice:
> 
> library(dplyr)
> 
> df <- expand.grid(year=c(1980, 2010), DP=c("Approve", "Disapprove")) %>%
>     mutate(freq = c(660, 640, 1000-660, 1000-640)) %>%
>     slice(rep(1:n(), freq))
> 
> 
> xtabs(~year+DP, df)
> 
> checks that it worked correctly. Or use dplyr::count() if that's what
> you're comfortable with.
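
For completeness, here is a rough sketch of how the long-format data might then be fed to infer. It assumes the goal is a two-sided permutation test of whether the approval proportion differs between the two surveys, which is a guess at the class problem and not something stated above. The seed, the number of replicates, and the names `obs`, `null_dist` and `p_val` are arbitrary choices made for illustration.

```r
library(dplyr)
library(infer)

set.seed(2025)  # arbitrary seed so the permutations are reproducible

# Rebuild the long-format data; year is converted to a factor so that infer
# treats it as a grouping variable and not as a number.
df <- expand.grid(year = c(1980, 2010), DP = c("Approve", "Disapprove")) %>%
    mutate(freq = c(660, 640, 1000 - 660, 1000 - 640)) %>%
    slice(rep(1:n(), freq)) %>%
    select(-freq) %>%
    mutate(year = factor(year))

# Observed difference in approval proportions, 1980 minus 2010.
obs <- df %>%
    specify(DP ~ year, success = "Approve") %>%
    calculate(stat = "diff in props", order = c("1980", "2010"))

# Null distribution under independence of DP and year, by permutation.
null_dist <- df %>%
    specify(DP ~ year, success = "Approve") %>%
    hypothesize(null = "independence") %>%
    generate(reps = 1000, type = "permute") %>%
    calculate(stat = "diff in props", order = c("1980", "2010"))

# Two-sided p-value for the observed difference.
p_val <- get_p_value(null_dist, obs_stat = obs, direction = "two-sided")
p_val
```

The `order` argument fixes the direction of the subtraction, so a positive statistic here means higher approval in 1980 than in 2010.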