[R] comparing strength of association instead of strength of evidence?

Kjetil Brinchmann Halvorsen kjetil at acelerate.com
Fri Jun 24 21:27:28 CEST 2005


Weiwei Shi wrote:

>Hi,
>I asked this question before, which was hidden in a bunch of
>questions. I rephrase it here and hope I can get some help this time:
>
>I have 2 contingency tables which have the same group variable Y. I
>want to compare the strength of association between X1/Y and X2/Y. I
>am not sure if comparing p-values IS the way, even though the
>probability of seeing such "weird" observation under H0 defines
>p-value and it might relate to the strength of association somehow.
>But I read the following statement from Alan Agresti's "An
>Introduction to Categorical Data Analysis" :
>"Chi-squared tests simply indicate the degree of EVIDENCE for an
>association....It is sensible to decompose chi-squared into
>components, study residuals, and estimate parameters such as odds
>ratios that describe the STRENGTH OF ASSOCIATION".
>
>  
>
Here are some things you can do:

 > tab1<-array(c(11266, 125, 2151526, 31734), dim=c(2,2))

 > tab2<-array(c(43571, 52, 2119221, 31807), dim=c(2,2))
 > library(epitools) # on CRAN
 > ?odds.ratio
Help for 'odds.ratio' is shown in the browser
 > library(help=epitools) # on CRAN
 > tab1
      [,1]    [,2]
[1,] 11266 2151526
[2,]   125   31734
 > odds.ratio(11266, 125, 2151526, 31734)
Error in fisher.test(tab) : FEXACT error 40.
Out of workspace.                 # so this is evidently for tables with smaller counts
 > library(vcd) # on CRAN

 > ?oddsratio
Help for 'oddsratio' is shown in the browser
 > oddsratio( tab1)  # this is really the log odds ratio
[1] 0.2807548
 > plot(oddsratio( tab1) )
 > library(help=vcd) # on CRAN  Read this for many nice functions.
 > fourfoldplot(tab1)
 > mosaicplot(tab1)     # not really useful for this table

Also have a look at the function CrossTable in package gmodels.
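
Since the exact method ran out of workspace on counts this large, the
odds ratio can also be computed by hand in base R. What follows is only
a sketch, assuming a Wald interval on the log scale is acceptable here;
wald.or is a made-up helper name, not a function from epitools or vcd.

```r
# Sketch only: sample odds ratio with a Wald confidence interval.
# wald.or is a hypothetical helper, not part of any package.
wald.or <- function(tab, conf.level = 0.95) {
    tab <- tab * 1                          # make sure we work in doubles
    # log odds ratio: log(n11 * n22 / (n12 * n21)), done on the log scale
    log.or <- log(tab[1, 1]) + log(tab[2, 2]) - log(tab[1, 2]) - log(tab[2, 1])
    se <- sqrt(sum(1 / tab))                # large-sample standard error of log OR
    z <- qnorm(1 - (1 - conf.level) / 2)
    c(estimate = exp(log.or),
      lower    = exp(log.or - z * se),
      upper    = exp(log.or + z * se))
}

tab1 <- array(c(11266, 125, 2151526, 31734), dim = c(2, 2))
tab2 <- array(c(43571, 52, 2119221, 31807), dim = c(2, 2))
or1 <- wald.or(tab1)
or2 <- wald.or(tab2)
print(rbind(tab1 = or1, tab2 = or2))
```

Putting the two estimates and their intervals side by side compares the
strength of association for X1/Y and X2/Y directly, which the two
p-values do not.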

To decompose the chi-square you can program it yourself:

decomp.chi <- function(tab) {
        rows <- rowSums(tab)
        cols <- colSums(tab)
        N <- sum(rows)
        E <- rows %o% cols / N        # expected counts under independence
        contrib <- (tab - E)^2 / E    # each cell's contribution to chi-squared
        contrib
}


 > decomp.chi(tab1)
          [,1]         [,2]
[1,] 0.1451026 0.0007570624
[2,] 9.8504915 0.0513942218
 >

So you can easily see which cell contributes most to the overall chi-squared.
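
As a cross-check, the same cell contributions can be recovered from
chisq.test in base R, since they are the squared Pearson residuals.
This is a sketch, assuming the uncorrected Pearson statistic (no
continuity correction) is the one being decomposed.

```r
# Sketch: recover the cell contributions from chisq.test itself.
tab1 <- array(c(11266, 125, 2151526, 31734), dim = c(2, 2))

fit <- chisq.test(tab1, correct = FALSE)   # correct = FALSE: no continuity correction
contrib <- fit$residuals^2                 # squared Pearson residuals, one per cell
print(contrib)

# The contributions add up to the overall Pearson chi-squared statistic.
print(sum(contrib))
print(fit$statistic)

# Standardized residuals are often easier to judge against +/- 2.
print(fit$stdres)
```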

Kjetil





>Can I do this "decomposition" in R for the following example including
>2 contingency tables?
>
>  
>
>>tab1<-array(c(11266, 125, 2151526, 31734), dim=c(2,2))
>>tab1
>>    
>>
>      [,1]    [,2]
>[1,] 11266 2151526
>[2,]   125   31734
>
>  
>
>>tab2<-array(c(43571, 52, 2119221, 31807), dim=c(2,2))
>>tab2
>>    
>>
>      [,1]    [,2]
>[1,] 43571 2119221
>[2,]    52   31807
>
>
>BTW, is there some good forum on the theory of statistics? r-help is a
>good one but I don't want to bother people by asking some questions
>weakly associated with R here.
>
>Thanks,
>
>  
>


-- 

Kjetil Halvorsen.

Peace is the most effective weapon of mass construction.
               --  Mahdi Elmandjra








