[R] correlation between categorical data

Sat Jan 24 02:54:48 CET 2015

Heinz Tuechler wrote
> At 07:40 21.06.2009, J Dougherty wrote:
> 
> [...]
>>There are other ways of regarding the FET.  Since it is precisely what
>>it says - an exact test - you can argue that you should avoid carrying
>>over any conclusions drawn about the small population the test was
>>applied to and employing them in a broader context.  In so far as the
>>test is concerned, the "sample" data and the contingency table it is
>>arrayed in are the entire universe.  In that sense, the FET can't be
>>"conservative" or "liberal."  It isn't actually a hypothesis test and
>>should not be thought of as one or used in the place of one.
>>
>>JDougherty
> 
> Could you give some reference, supporting this, for me, surprising 
> view? I don't see a necessary connection between an exact test and 
> the idea that it does not test a hypothesis.
> 
> Thanks,
> Heinz

Fisher's Exact Test is a nonparametric "test."  It compares the observed
contingency table against every table that could arise with the same row
and column totals, and gives you the exact probability of an arrangement
at least as extreme as the one observed.  No more and no less.  You could
argue about the greater population from which your sample is drawn, but
FET makes no assumptions at all about any greater sample universe.  Also,
since the "population" being used in FET is strictly limited to the
members of the contingency table, the result comes from a finite set of
possible outcomes that are relevant only to that specific arrangement of
data.  You are not "estimating" parameters of a parent population or
making any assumptions about the parent distribution.  You can designate
a "p" value such as 0.05 as a level of significance, but there is no
"error" term in the FET result.  Fisher did state that the test assumes a
null hypothesis of independence, under which the cell counts follow a
hypergeometric distribution.  But that creates other issues if you are
attempting to use the results in conjunction with assumptions about a
broader sample universe than that in the test.  For instance, you have to
carry the assumption of a hypergeometric distribution over into the land
of reality your sample is drawn from, and you then have to justify that.
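
To make the "finite group of possible results" concrete, here is a minimal
sketch for the 2x2 case.  It is written in Python using only the standard
library, the function names are my own, and the counts in the example are
made up purely for illustration.  It lists every table that shares the
observed row and column totals, gives each its hypergeometric probability,
and adds up the tables that are no more probable than the observed one.  I
believe that is the two-sided rule fisher.test uses for 2x2 tables, but
treat that as an assumption rather than a statement about R's internals.

```python
from fractions import Fraction
from math import comb


def table_probability(a, row1, row2, col1):
    """Hypergeometric probability of the 2x2 table whose top-left cell is a,
    given fixed row totals row1, row2 and fixed first-column total col1."""
    n = row1 + row2
    return Fraction(comb(row1, a) * comb(row2, col1 - a), comb(n, col1))


def fisher_exact_2x2(a, b, c, d):
    """Enumerate all tables with the margins of [[a, b], [c, d]].

    Returns the probability of the observed table, the two-sided p-value
    (sum over tables no more probable than the observed one), and the
    full distribution keyed by the top-left cell count.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    probs = {k: table_probability(k, row1, row2, col1)
             for k in range(lowest, highest + 1)}
    p_observed = probs[a]
    p_two_sided = sum(p for p in probs.values() if p <= p_observed)
    return p_observed, p_two_sided, probs


# Illustrative counts only: rows 3,1 and 1,3, so every margin equals 4.
p_observed, p_two_sided, probs = fisher_exact_2x2(3, 1, 1, 3)
print(p_observed)    # 8/35
print(p_two_sided)   # 17/35
```

Exact fractions are used so that the comparison against the observed
probability is not disturbed by floating-point rounding.  With all four
margins equal to 4 there are only five possible tables, which is the whole
"universe" the test works in.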
