[R] Q and R mode in Principal Component Analysis
William Revelle
lists at revelle.net
Wed Sep 7 00:01:05 CEST 2011
At 4:10 PM +0100 9/6/11, Lívio Cipriano wrote:
>Hi,
>
>Can anyone explain to me the differences in Q and R mode in Principal Component
>Analysis, as performed by prcomp and princomp respectively.
Dear Livio,
The help file of prcomp says it pretty well:
"The calculation is done by a singular value
decomposition of the (centered and possibly
scaled) data matrix, not by using eigen on the
covariance matrix. This is generally the
preferred method for numerical accuracy."
Compare this with the help file for princomp:
"princomp only handles so-called R-mode PCA, that
is feature extraction of variables. If a data
matrix is supplied (possibly via a formula) it is
required that there are at least as many units as
variables. For Q-mode PCA use prcomp."
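To make the two help-file passages concrete, here is a minimal sketch in R. The matrix X is simulated and its dimensions are arbitrary assumptions, not data from this thread. It checks that the eigen route and the SVD route agree on a well-behaved tall matrix, and that princomp differs from prcomp only in its divisor (n rather than n - 1).

```r
set.seed(42)                                     # arbitrary seed, reproducibility only
X <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4)  # assumed: 20 units, 4 variables

# Route 1: eigenvalues of the covariance matrix
ev <- eigen(cov(X), symmetric = TRUE)$values

# Route 2: singular values of the column-centered data matrix
Xc <- scale(X, center = TRUE, scale = FALSE)
d  <- svd(Xc)$d

# prcomp takes the SVD route and scales with divisor n - 1
p <- prcomp(X)
same_as_eigen <- all.equal(p$sdev^2, ev)
same_as_svd   <- all.equal(p$sdev, d / sqrt(nrow(X) - 1))

# princomp takes the eigen route and uses divisor n
q <- princomp(X)
princomp_scaled <- all.equal(unname(q$sdev)^2,
                             ev * (nrow(X) - 1) / nrow(X))
```

On data like these the two routes give the same answer up to rounding; the difference matters for accuracy on ill-conditioned data and for what shapes of matrix each function accepts.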
This R and Q (as well as S and T) terminology was
introduced (at least in psychology) by Ray
Cattell in his discussion of the "Data Box". It
is the idea that you can consider three
dimensions of data (across subjects, variables,
and time). Then there are six different ways to
cut up the data. A typical data matrix has rows
for observations and columns for variables.
Typically the number of rows >> columns. If you
are trying to find a structure that reduces the
complexity of the variables, you do the normal
analysis (R) of the variables. An alternative is
to do the analysis on the transpose of the data
matrix (Q analysis), that is, to try to reduce
the complexity of the rows.
This is not a problem if you do a singular value
decomposition (which is what prcomp does). It
can be if you do a princomp analysis, which is
based upon the covariance of the data.
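A small sketch of that difference, again on simulated data with assumed dimensions: prcomp accepts the transposed (wide) matrix, while princomp stops with an error because there are fewer units than variables. The object names are only illustrative.

```r
set.seed(1)                                       # arbitrary seed
X  <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4)  # assumed: n = 20 units, v = 4 variables
Xt <- t(X)                                        # 4 "units" by 20 "variables"

# R-mode: both functions accept the tall matrix
r_prcomp   <- prcomp(X)
r_princomp <- princomp(X)

# Q-mode: prcomp works on the wide, transposed matrix
q_prcomp <- prcomp(Xt)
n_comp   <- length(q_prcomp$sdev)   # min(4, 20) = 4 components reported

# princomp refuses the wide matrix; capture the error message instead of stopping
q_princomp <- tryCatch(princomp(Xt),
                       error = function(e) conditionMessage(e))
```

Here q_princomp ends up holding the error message rather than a fitted object. Note that prcomp(Xt) centers the columns of Xt, so each original row is centered across the variables.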
Let nXv represent your original matrix (n
observations on v variables). For an R analysis,
using princomp, you are finding the principal
components of the covariance matrix C, which is of
size v x v with rank at most the lesser of n - 1
and v (centering uses up one degree of freedom).
But for a Q analysis, if you are using princomp,
you are trying to find the principal components
of a covariance matrix C* which has dimensions n
x n but a rank of at most the lesser of n and
v - 1. That is, if the number of rows > number of
columns, the rank of the covariance matrix of the
transposed matrix can be no more than the number
of columns, although the size of that covariance
matrix will be n x n.
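The size-versus-rank point can be checked numerically. This is a sketch on simulated data; n, v, and the helper num_rank (with its tolerance) are assumptions made for illustration. The rank is counted as the number of eigenvalues that are not numerically zero.

```r
set.seed(7)                                  # arbitrary seed
n <- 20; v <- 4                              # assumed sizes
X <- matrix(rnorm(n * v), nrow = n, ncol = v)

C      <- cov(X)      # v x v: variables over units (R-mode)
C_star <- cov(t(X))   # n x n: units over variables (Q-mode)

# hypothetical helper: numerical rank of a symmetric matrix
num_rank <- function(M, tol = 1e-8) {
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  sum(ev > tol * max(ev))
}

dim_C      <- dim(C)        # v x v
dim_C_star <- dim(C_star)   # n x n
rank_C      <- num_rank(C)       # full rank here, since n - 1 >= v
rank_C_star <- num_rank(C_star)  # far below n: bounded by the number of columns of X
```

With these sizes C* is 20 x 20 but has only v - 1 = 3 nonzero eigenvalues, one fewer than the number of columns because cov() centers the data.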
Q analysis is looking for patterns of similarity
in the subjects over variables, R analysis is
looking for similarity in the variables over
subjects. This then gets generalized to the case
of subjects over time, variables over time, and so on.
"The data box emphasized that we are not limited
to correlating tests over people at one time. In
its 1946 formulation, there were six 'designs of
covariation using literal measurement' and 12
'designs of covariation using differential or
ratio measurement' (Cattell, 1946c, p 94-95).
Considering Persons, Tests, and Occasions as the
fundamental dimensions, it was possible to
generalize the normal correlation of Tests over
Persons design (R analysis) to consider how
Persons correlated over Tests (Q analysis), or
Tests over Occasions (P analysis), etc. Cattell
(1966) extended the data box's original three
dimensions to five by adding Background or
preceding conditions as well as Observers (see
also Cattell (1977)). Applications of the data
box concept have been seen throughout psychology,
but the primary influence has probably been on
those who study personality development and
change over the life span (McArdle & Bell, 2000,
Mroczek, 2007, Nesselroade, 1984). Unfortunately,
even for the original three dimensions, Cattell
(1978) used a different notation than he did in
Cattell (1966, 1977) or Cattell (1946b)."
British Journal of Psychology (2009), 100, 253-257
© 2009 The British Psychological Society
[1] R. B. Cattell. The data box: Its ordering
of total resources in terms of possible
relational systems. In R. B. Cattell, editor,
Handbook of multivariate experimental psychology,
pages 67-128. Rand-McNally, Chicago, 1966.
I suspect this is more than you wanted to know.
Bill
>
>Regards
>
>Lívio Cipriano
>