[R] bootstrap CI of the difference between 2 Cramer's V

Daniel Nordlund djnord|und @end|ng |rom gm@||@com
Sat Jun 4 09:31:29 CEST 2022


On 5/28/2022 11:21 AM, varin sacha via R-help wrote:
> Dear R-experts,
>
> While comparing groups, it is better to assess confidence intervals of those differences rather than comparing confidence intervals for each group.
> I am trying to calculate the CIs of the difference between the two Cramer's V and not the CI to the estimate of each group’s Cramer's V.
>
> Here below my toy R example. There are error messages. Any help would be highly appreciated.
>
> ##############################
> library(questionr)
> library(boot)
>
> gender1<-c("M","F","F","F","M","M","F","F","F","M","M","F","M","M","F","M","M","F","M","F","F","F","M","M","M","F","F","M","M","M","F","M","F","F","F","M","M","F","M","F")
> color1<-c("blue","green","black","black","green","green","blue","blue","green","black","blue","green","blue","black","black","blue","green","blue","green","black","blue","blue","black","black","green","green","blue","green","black","green","blue","black","black","blue","green","green","green","blue","blue","black")
>
> gender2<-c("F","F","F","M","M","F","M","M","M","F","F","M","F","M","F","F","M","M","M","F","M","M","M","F","F","F","M","M","M","F","M","M","M","F","F","F","M","F","F","F")
> color2<-c("green","blue","black","blue","blue","blue","green","blue","green","black","blue","black","blue","blue","black","blue","blue","green","blue","black","blue","blue","black","black","green","blue","black","green","blue","green","black","blue","black","blue","green","blue","green","green","blue","black")
>
> f1=data.frame(gender1,color1)
> tab1<-table(gender1,color1)
> e1<-cramer.v(tab1)
>
> f2=data.frame(gender2,color2)
> tab2<-table(gender2,color2)
> e2<-cramer.v(tab2)
>
> f3<-data.frame(e1-e2)
>
> cramerdiff=function(x,w){
> y<-tapply(x[w,1], x[w,2],cramer.v)
> y[1]-y[2]
> }
>
> results<-boot(data=f3,statistic=cramerdiff,R=2000)
> results
>
> boot.ci(results,type="all")
> ##############################
>
>   
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

I don't know if someone responded offline, but if not, there are a 
couple of problems with your code.   First, the f3 dataframe is not what 
you think it is.  Second, your cramerdiff function isn't going to 
produce the results that you want.

I would put your data into a single dataframe with a variable 
designating which group data came from.  Then use that variable as the 
strata variable in the boot function to resample within groups.  So 
something like this:

f1 <- data.frame(gender=gender1,color=color1,group='grp1')
f2 <- data.frame(gender=gender2,color=color2,group='grp2')
f3 <- rbind(f1,f2)

cramerdiff <- function(x, ndx) {
    # calculate cramer.v for group 1 bootstrap sample
    g1 <-x[ndx,][x[,3]=='grp1',]
    cramer_g1 <- cramer.v(table(g1[,1:2]))
    # calculate cramer.v for group 2 bootstrap sample
    g2 <-x[ndx,][x[,3]=='grp2',]
    cramer_g2 <- cramer.v(table(g2[,1:2]))
    # calculate difference
    cramer_g1-cramer_g2
    }
# use strata parameter in function boot to resample within each group
results <- boot(data=f3,statistic=cramerdiff, 
strata=as.factor(f3$group),R=2000)
results
boot.ci(results)


Hope this is helpful,

Dan

-- 
Daniel Nordlund
Port Townsend, WA  USA


-- 
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus



More information about the R-help mailing list