[R] Multiple comparisons in a non parametric case

Spencer Graves spencer.graves at pdf.com
Wed Sep 8 01:16:21 CEST 2004

      Great summary, Rolf. 

      Just one minor issue that recently bit me:  In a data mining 
application with hundreds of p-values, people want to make subtle 
distinctions based on extremely small p-values.  In such applications, 
even a modest amount of skewness (to say nothing of outliers) might have 
a surprising (and not necessarily monotonic) impact on p-values. 

      Best Wishes,
      Spencer Graves

Rolf Turner wrote:

>It looks to me like what you are doing is trying to judge
>significance of differences by non-overlap of single-sample
>confidence intervals.  While this is appealing, it's not quite
>
>I just looked into my copy of Applied Nonparametric Statistics
>(second ed.) by Wayne W. Daniel (Duxbury, 1990) but that
>only deals with the situation where there is a single replicate
>per block-treatment combination (whereas you have 10 reps)
>and block-treatment interaction is assumed to be non-existent.
>The method that Daniel prescribes in this simple setting seems to be
>no more than applying the Bonferroni method of multiple comparisons.
>(Daniel does not say; his book is very much a cook-book.)  So you
>might simply try Bonferroni --- i.e. do all k-choose-2 pairwise
>comparisons between treatments (using the appropriate two-sample
>method for each comparison), doing each comparison at the
>alpha/(k-choose-2) significance level, where k = the number of
>treatments = 4 in your case.  This method is not going to be
>super-powerful but it is sometimes surprising how well Bonferroni
>stacks up against more ``sophisticated'' methods.
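
The Bonferroni recipe described above can be sketched as follows, in plain
Python using only the standard library. This is a minimal illustration, not
a full analysis: the function names are hypothetical, and the p-values are
assumed to come from whatever two-sample test is appropriate for each pair
(that test is not implemented here).

```python
from itertools import combinations


def bonferroni_pairs(treatments, alpha=0.05):
    """List all pairwise comparisons and the per-comparison level.

    With k treatments there are C(k, 2) pairs, and each pair is tested
    at alpha / C(k, 2).
    """
    pairs = list(combinations(treatments, 2))
    per_test_alpha = alpha / len(pairs)
    return pairs, per_test_alpha


def significant_pairs(p_values, alpha=0.05):
    """Return the pairs whose unadjusted p-value passes Bonferroni.

    p_values is assumed to map every pair (a, b) to the p-value of the
    chosen two-sample test, so len(p_values) is the number of comparisons.
    """
    threshold = alpha / len(p_values)
    return [pair for pair, p in p_values.items() if p <= threshold]


# Four treatments, as in the original question: 6 pairs, each at 0.05 / 6.
pairs, level = bonferroni_pairs(["A", "B", "C", "D"])
print(len(pairs), round(level, 5))  # prints: 6 0.00833
```

With k = 4 each of the 6 comparisons is therefore run at roughly the 0.0083
level rather than 0.05.
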
>Daniel gives a reference to ``Nonparametric Statistical Methods'' by
>Myles Hollander and Douglas A. Wolfe, New York, Wiley, 1973, for ``an
>alternative multiple comparisons formula''.  I don't have this book,
>and don't know what direction Hollander and Wolfe ride off in, but it
>***might*** be worth trying to get your hands on it and see.
>Finally --- in what way are the assumptions of Anova violated?  The
>conventional wisdom is that Anova is actually quite robust to
>non-normality.  Particularly when the sample size is large --- and 10
>reps per treatment combination is pretty good.  Heteroskedasticity is
>more of a worry, but it's not so much of a worry when the design is
>nicely balanced.  As yours is.  And finally-finally --- have you
>tried transforming your data to make them a bit more normal and/or
>
>I hope this is some help.
>				cheers,
>					Rolf Turner
>					rolf at math.unb.ca
>Marco Chiarandini wrote:
>>I am conducting a full factorial analysis. I have one factor
>>consisting of algorithms, which I consider my treatments, and another
>>factor made up of the problems I want to solve. For each problem I
>>obtain a response variable which is stochastic. I replicate the
>>measurement of this response 10 times.
>>When I apply ANOVA the assumptions do not hold, hence I must rely on
>>nonparametric tests.
>>After transforming the response data into ranks, the Friedman test
>>tells me that the rank sum of at least one treatment differs
>>significantly from the others.
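
For reference, the rank sums and the Friedman statistic mentioned above can
be sketched as follows in plain Python. This is a sketch under assumptions
that the original message does not state: each block (problem) contributes
exactly one value per treatment (algorithm), for instance a summary of the
10 replicates, and no correction for ties is applied to the statistic. The
function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def friedman(blocks):
    """Rank within each block, then sum the ranks per treatment.

    blocks: list of rows, one row per block, one value per treatment.
    Returns (rank_sums, statistic), where
    statistic = 12 / (b k (k + 1)) * sum(R_j^2) - 3 b (k + 1),
    with b blocks, k treatments and R_j the rank sum of treatment j.
    """
    b = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    stat = 12.0 / (b * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * b * (k + 1)
    return rank_sums, stat
```

Under the null hypothesis the statistic is approximately chi-squared with
k - 1 degrees of freedom; how best to fold the 10 replicates into this
layout is left open here.
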
>>I would now like to produce a plot for the multiple comparisons
>>similar to those based on the Least Significant Difference or Tukey's
>>Honest Significant Difference in ANOVA. Since I am in the
>>nonparametric case I cannot use these methods.
>>Instead, I compare individual treatments graphically by plotting the
>>sum of ranks of each treatment together with its 95% confidence
>>interval. To compute the interval I use the Friedman test as
>>suggested by Conover in "Practical Nonparametric Statistics".
>>I obtain something like this:
>>Treat. A                |-+-|
>>Treat. B              |-+-|
>>Treat. C                   |-+-|
>>Treat. D           |-+-|
>>The intervals all have the same width because the number of
>>replications was the same for all experimental units.
>>I would like to know whether anyone on the list has had a similar
>>experience and whether what I am doing is correct. Alternatively, a
>>pointer to another list better suited to my request would be welcome.

Spencer Graves, PhD, Senior Development Engineer
O:  (408)938-4420;  mobile:  (408)655-4567
