[R] how to correlate nominal variables?
Daniel Malter
daniel at umd.edu
Tue Aug 25 18:41:16 CEST 2009
I updated the previously posted function for Cramer's V so that it
automatically prints Cramer's V, the chi-square statistic, the degrees of
freedom, and the p-value (computed from the chi-square value and the
degrees of freedom), each rounded to a user-supplied number of digits. An
example is included.
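cramers.v=function(x,digits){
  # x: two-column data frame (or matrix) holding the two categorical variables
  # digits: rounding digits for V, chi-square, DFs and p-value, in that order
  x=as.data.frame(x)
  tab=table(x)      # contingency table, computed once
  n=sum(tab)        # observations actually tabulated (rows with NA are dropped)
  row.sum=rowSums(tab)
  col.sum=colSums(tab)
  # Pearson chi-square: sum over all cells of (observed-expected)^2/expected
  chisq=0
  for(k in 1:nrow(tab)){
    for(l in 1:ncol(tab)){
      expected=row.sum[k]*col.sum[l]/n
      chisq=chisq+(tab[k,l]-expected)^2/expected
    }
  }
  # These depend only on the final chi-square, so compute them after the loops
  v=sqrt(chisq/(n*(min(dim(tab))-1)))
  dfs=(nrow(tab)-1)*(ncol(tab)-1)
  sig=pchisq(chisq,dfs,lower.tail=FALSE)
  result=data.frame(round(v,digits[1]),round(chisq,digits[2]),round(dfs,digits[3]),round(sig,digits[4]))
  names(result)=c("Cramer's V","Chi-square","DFs","p-value")
  print(result)
}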
##Example
#Create correlated a and b
a=rnorm(100)
e=rnorm(100)
b=a+e
#Split a and b into quartile categories
a=cut(a,breaks=quantile(a),include.lowest=TRUE,labels=FALSE)
b=cut(b,breaks=quantile(b),include.lowest=TRUE,labels=FALSE)
#Cross-tabulate a and b
table(a,b)
#Compute Cramer's V, Chi-square, degrees of freedom and significance
#supply the (maximum) number of digits you want for each value
cramers.v(data.frame(a,b),digits=c(3,2,0,5))
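
As a cross-check, the same quantities can probably be obtained from base R's
chisq.test(), which computes the Pearson chi-square directly from the
cross-tabulation. The sketch below is only an illustration: the helper name
cramers.v.check is hypothetical and not part of the function above, and it
assumes the continuity correction should be switched off (correct=FALSE) so
that 2x2 tables give the uncorrected statistic.

```r
# Hypothetical helper (an assumption for illustration, not the posted function):
# Cramer's V computed from the statistic returned by chisq.test()
cramers.v.check=function(a,b){
  tab=table(a,b)
  # correct=FALSE disables the Yates continuity correction (it only affects 2x2 tables);
  # warnings about small expected counts are suppressed for this illustration
  test=suppressWarnings(chisq.test(tab,correct=FALSE))
  chisq=unname(test$statistic)
  v=sqrt(chisq/(sum(tab)*(min(dim(tab))-1)))
  c(V=v,chisq=chisq,df=unname(test$parameter),p=test$p.value)
}

# Small fixed vectors so the output does not depend on random draws
a=c(1,1,1,2,2,2,3,3,3,3)
b=c(1,1,2,2,2,3,3,3,3,1)
cramers.v.check(a,b)
```

Applied to the quartile-coded a and b from the example, this should give the
same V, chi-square, and degrees of freedom as cramers.v, up to rounding.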