[R] testing whether clusters in a PCA plot are significantly different from one another

Marchesi, Julian j.marchesi at imperial.ac.uk
Mon Jan 9 09:18:23 CET 2017

Dear Michael

So I would be much better off just reporting the PCA as is and concluding what I can from the plot.



Julian R. Marchesi

Deputy Director and Professor of Clinical Microbiome Research at the  Centre for Digestive and Gut Health, Imperial College London, London W2 1NY Tel: +44 (0)20 331 26197


Professor of Human Microbiome Research at the School of Biosciences, Museum Avenue, Cardiff University, Cardiff, CF10 3AT, Tel: +44 (0)29 208 74188, Fax: +44 (0)29 20874305, Mobile 07885 569144

From: Michael Friendly <friendly at yorku.ca>
Sent: 07 January 2017 17:15
To: Marchesi, Julian; 'r-help at r-project.org'
Subject: Re: testing whether clusters in a PCA plot are significantly different from one another

Significance tests for group differences in a MANOVA of
lm(cbind(pc1, pc2) ~ group)

will get you what you want, but my advice is DON'T DO THIS, at least
not without a huge grain of salt and a slew of mea culpas.
Otherwise, you are committing p-value abuse and contributing to the
notion that significance tests must be used to justify all conclusions.

The p-values will not be correct under the standard normal theory of the
multivariate GLM because pc1 and pc2 were chosen to maximize
the variance accounted for by linear combinations of the variables, and
there is no theory that can correct for this, AFAIK.  The cluster "group"
assignment was also chosen to optimize some (other) criterion.
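
A minimal sketch of why the p-values mislead, under assumptions that are not from the thread: the data are simulated pure noise, the "groups" come from k-means on the first two PC scores, and the choice of 3 clusters is arbitrary. Because both the components and the groups were picked by looking at the same data, the MANOVA should report a very small p-value even though no real group structure exists.

```r
# Illustration only: simulated noise, so there are no true groups.
set.seed(1)
x <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)

# PCA on the noise; keep the scores on the first two components.
pca <- prcomp(x, center = TRUE, scale. = TRUE)
pc1 <- pca$x[, 1]
pc2 <- pca$x[, 2]

# Hypothetical clustering step: k-means on the same two scores.
km    <- kmeans(cbind(pc1, pc2), centers = 3, nstart = 10)
group <- factor(km$cluster)

# The MANOVA from the message: multivariate lm, then anova()
# (Pillai's trace is the default test statistic for an "mlm" fit).
fit <- lm(cbind(pc1, pc2) ~ group)
tab <- anova(fit)
print(tab)

# p-value for the group term; small here despite pure-noise input.
p_value <- tab["group", "Pr(>F)"]
```

The test is "significant" only because the clusters were defined to separate the PC scores, which is the double optimization described above.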

Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA
