[R] Setting up hypothesis tests with the infer library?
Kevin Zembower
kev|n @end|ng |rom zembower@org
Mon Mar 31 18:26:25 CEST 2025
Neal, thanks so much for precisely focusing on the exact question I
asked. While I can follow along with the workings of your solution, I
never would have thought of it myself.
I'm glad to learn of all the other solutions offered, too. They
broadened my understanding of what I was doing in my statistics class
and how I was doing it in R. Thanks, all who contributed to this
thread.
Thanks, again, Neal, for your help.
-Kevin
On Sat, 2025-03-29 at 13:46 -0700, Neal Fultz wrote:
> >
> > I've been setting up problems like this with code similar to:
> > ===========================
> > df <- data.frame(
> > survey = c(rep("1980", 1000), rep("2010", 1000)),
> > DP = c(rep("Y", 0.66*1000), rep("N", 1000 - (0.66*1000)),
> > rep("Y", 0.64*1000), rep("N", 1000 - (0.64*1000))))
> >
>
>
> I think infer needs the data in 'long' format. Here is how I would
> approach it:
>
> df <- expand.grid(year=c(1980, 2010), DP=c("Approve", "Disapprove"))
>
> expand.grid() creates every combination of year and DP (the view), one
> row per combination.
>
> df <- df[ rep(1:4, c(660, 640, 1000-660, 1000-640)), ]
>
> Here I used data frame indexing to repeat each of the four rows. I find
> this a lot easier than manually creating columns the way you did.
> Tidyverse-style packages don't encourage this style, though. Tastes will
> vary.
>
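The indexing idiom above depends on the row order that expand.grid()
produces: the first variable (year) varies fastest. A minimal sketch with
deliberately tiny, made-up counts (the names grid, counts, idx and small
are illustrative, not from the thread) makes the mapping from counts to
rows visible:

```r
# Small illustration of the row-repetition idiom (counts shrunk for display).
grid <- expand.grid(year = c(1980, 2010), DP = c("Approve", "Disapprove"))
print(grid)  # rows: 1980/Approve, 2010/Approve, 1980/Disapprove, 2010/Disapprove

# Hypothetical counts, one per row of grid, in the same order as its rows.
counts <- c(3, 2, 1, 2)

# Repeat each row index by its count, then index the data frame by rows.
idx <- rep(seq_len(nrow(grid)), counts)
small <- grid[idx, ]

print(idx)
print(xtabs(~ year + DP, small))
```

Because the counts are matched to rows by position, it is worth printing
the grid once before trusting the order of the counts vector.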
> In dplyr style, add a column and use it in slice:
>
> library(dplyr)
>
> df <- expand.grid(year=c(1980, 2010), DP=c("Approve", "Disapprove")) %>%
>   mutate(freq = c(660, 640, 1000-660, 1000-640)) %>%
>   slice(rep(1:n(), freq))
>
>
> xtabs(~year+DP, df)
>
> checks that it worked correctly. Or use dplyr::count() if that's what
> you're comfortable with.
>
>
>
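For readers arriving from the subject line, here is a hedged sketch of how
the long-format data frame might be passed to infer for a permutation test
of a difference in proportions. Several choices are assumptions rather
than anything stated in the thread: year is converted to a factor, the
test is two-sided, 1000 permutations are drawn, and the seed is arbitrary.
The infer steps run only if the package is installed, and the native pipe
requires R 4.1 or later.

```r
# Rebuild the long-format data from the thread (one row per respondent).
df <- expand.grid(year = c(1980, 2010), DP = c("Approve", "Disapprove"))
df <- df[rep(1:4, c(660, 640, 1000 - 660, 1000 - 640)), ]
df$year <- factor(df$year)  # assumption: treat survey year as a category

# Check the cell counts before testing.
tab <- xtabs(~ year + DP, df)
print(tab)

# Sketch of an infer workflow; skipped when infer is not available.
if (requireNamespace("infer", quietly = TRUE)) {
  library(infer)
  set.seed(1)  # arbitrary seed so the permutation draws are repeatable

  # Observed difference in approval proportions, 1980 minus 2010.
  obs <- df |>
    specify(DP ~ year, success = "Approve") |>
    calculate(stat = "diff in props", order = c("1980", "2010"))

  # Null distribution under independence of DP and year.
  null_dist <- df |>
    specify(DP ~ year, success = "Approve") |>
    hypothesize(null = "independence") |>
    generate(reps = 1000, type = "permute") |>
    calculate(stat = "diff in props", order = c("1980", "2010"))

  # Assumption: two-sided alternative.
  print(get_p_value(null_dist, obs_stat = obs, direction = "two-sided"))
}
```

If the class exercise calls for a one-sided alternative, the direction
argument would change accordingly.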