[R] Error in principal component loadings calculation
David L Carlson
dcarlson at tamu.edu
Mon Sep 14 23:07:31 CEST 2015
The squared loadings in each column will always sum to 1 because each column is an eigenvector scaled to unit length. The terminology for principal components is not as consistent as we could hope. What princomp() calls loadings are the unit-length eigenvectors, that is, the coefficients used to compute the principal component scores from the (standardized) variables. What many texts call loadings are those coefficients rescaled by the standard deviation of each component; with cor=TRUE the rescaled values are the correlations between each variable and the component. You are probably looking for the rescaled version, which is easy to obtain by multiplying each column by the corresponding standard deviation:
> set.seed(42)
> data <- matrix(runif(100), 20, 5)
> pc <- princomp(data, cor=TRUE)
> loadings(pc)
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
[1,] 0.638 0.249 -0.260 -0.679
[2,] -0.714 0.449 0.298 -0.444
[3,] 0.585 -0.152 0.522 -0.231 0.555
[4,] -0.617 -0.543 -0.564
[5,] -0.496 0.154 0.479 -0.687 -0.172
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
SS loadings 1.0 1.0 1.0 1.0 1.0
Proportion Var 0.2 0.2 0.2 0.2 0.2
Cumulative Var 0.2 0.4 0.6 0.8 1.0
> rowSums(pc$loadings^2)
[1] 1 1 1 1 1
> # Notice that the sums of the squared loadings all equal 1
> # (colSums() gives the same values because the loading matrix is orthogonal)
> # Now multiply each loading by its standard deviation
> sweep(pc$loadings, 2, pc$sdev, "*")
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
[1,] 0.765 0.275 -0.237 -0.531
[2,] -0.787 0.427 0.271 -0.347
[3,] 0.701 -0.167 0.497 -0.211 0.434
[4,] -0.680 -0.518 -0.515
[5,] -0.594 0.169 0.456 -0.627 -0.134
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
SS loadings 1.436 1.215 0.907 0.832 0.611
Proportion Var 0.287 0.243 0.181 0.166 0.122
Cumulative Var 0.287 0.530 0.712 0.878 1.000
> pc$sdev^2
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
1.4362072 1.2145055 0.9068555 0.8315685 0.6108632
> # Now the sum of the squared loadings equals the
> # squared standard deviation (aka the eigenvalues)
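The same check can be written as a short script that avoids the print method for loadings (which blanks out small values). This is only a sketch: the names raw, scaled, ss_raw, ss_scaled, eigenvalues and prop_var are mine, not part of princomp(), and the seed and data are taken from the session above.

```r
# Sketch: recompute the sums of squares from the session above
set.seed(42)
data <- matrix(runif(100), 20, 5)
pc <- princomp(data, cor = TRUE)

# unclass() drops the "loadings" class, leaving a plain numeric matrix
raw <- unclass(pc$loadings)

# Each column of raw has unit length, so its squared entries sum to 1
ss_raw <- colSums(raw^2)

# Multiply each column by the standard deviation of its component
scaled <- sweep(raw, 2, pc$sdev, "*")
ss_scaled <- colSums(scaled^2)

# Eigenvalues and the share of the total variance for each component
eigenvalues <- pc$sdev^2
prop_var <- eigenvalues / sum(eigenvalues)

print(round(ss_scaled, 3))
print(round(prop_var, 3))
```

With the same seed, the two printed vectors should agree with the "SS loadings" and "Proportion Var" rows of the rescaled output above. summary(pc) also reports the proportion of variance for each component directly.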
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Marcelo Kittlein
Sent: Monday, September 14, 2015 8:46 AM
To: r-help at r-project.org
Subject: [R] Error in principal component loadings calculation
Hi all
I have been using "princomp" to obtain the principal components of some
data and find that the loadings returned by the function appear to have
some error.
In a simple example, if I calculate the PCs for a random matrix, I get that
all loadings for the different components have the same proportion of
variance:
data <- matrix(runif(100), 20, 5)
pc <- princomp(data, cor=TRUE)
loadings(pc)
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
[1,] -0.280 0.510 0.674 -0.217 -0.400
[2,] 0.529 -0.353 -0.694 -0.330
[3,] -0.111 0.563 -0.713 -0.336 -0.222
[4,] -0.530 -0.502 -0.178 0.140 -0.645
[5,] -0.590 -0.215 -0.582 0.516
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
SS loadings 1.0 1.0 1.0 1.0 1.0
Proportion Var 0.2 0.2 0.2 0.2 0.2
Cumulative Var 0.2 0.4 0.6 0.8 1.0
This keeps returning the same proportion of variance for each component
regardless of the data used.
My R version is
> R.Version()
$platform
[1] "x86_64-unknown-linux-gnu"
$arch
[1] "x86_64"
$os
[1] "linux-gnu"
$system
[1] "x86_64, linux-gnu"
$status
[1] ""
$major
[1] "3"
$minor
[1] "2.1"
$year
[1] "2015"
$month
[1] "06"
$day
[1] "18"
$`svn rev`
[1] "68531"
$language
[1] "R"
$version.string
[1] "R version 3.2.1 (2015-06-18)"
$nickname
[1] "World-Famous Astronaut"
Any hint would be much appreciated.
Best regards
Marcelo Kittlein