[R] an efficient way to calculate correlation matrix
Dennis Murphy
djmuser at gmail.com
Thu Jun 2 18:27:00 CEST 2011
?cor
Example:
> dd <- data.frame(x1 = rnorm(40), x2 = rnorm(40), x3 = runif(40, 0, 10))
> str(dd)
'data.frame': 40 obs. of 3 variables:
 $ x1: num -0.5585 1.3831 -1.7862 0.0572 0.2825 ...
 $ x2: num -0.5247 -0.8636 -0.0749 0.2399 -0.1592 ...
 $ x3: num 7.698 5.259 0.918 3.251 5.169 ...
> cor(dd)
           x1          x2          x3
x1  1.0000000 -0.23268659 -0.02915700
x2 -0.2326866  1.00000000 -0.07073142
x3 -0.0291570 -0.07073142  1.00000000
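To tie this back to the nested-loop approach in the question, here is a rough sketch that checks a double 'for' loop against a single call to cor(). The helper name loop_cor, the seed, and the sizes n and m are made up for illustration; they are not from the original post.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
n <- 200                           # illustrative number of observations
m <- 30                            # illustrative number of variables
X <- as.data.frame(matrix(rnorm(n * m), nrow = n, ncol = m))

# Hypothetical helper: fill an m x m matrix one pair at a time
loop_cor <- function(X) {
  m <- ncol(X)
  out <- matrix(NA_real_, m, m)
  for (i in seq_len(m)) {
    for (j in seq_len(m)) {
      out[i, j] <- cor(X[[i]], X[[j]])
    }
  }
  out
}

R_loop <- loop_cor(X)
R_vec  <- cor(X)                   # one call; columns are the variables

# Compare values only, since cor(X) carries dimnames and the loop result does not
same <- all.equal(R_loop, R_vec, check.attributes = FALSE)
same
```

Wrapping each version in system.time() should show the difference in speed on your own data; the gap depends on m and n.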
cor() will also run on a matrix of numeric variables. Any factor or
character variable among those passed to cor() will cause an error;
for example,
> library(nlme)   # the Oats data set comes with the nlme package
> head(Oats, 3)
Grouped Data: yield ~ nitro | Block
  Block Variety nitro yield
1     I Victory   0.0   111
2     I Victory   0.2   130
3     I Victory   0.4   157
> cor(Oats)
Error in cor(Oats) : 'x' must be numeric
> cor(Oats[, 3:4])
          nitro     yield
nitro 1.0000000 0.6130266
yield 0.6130266 1.0000000
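If you would rather not pick the numeric columns by position, one possible approach is to test each column with is.numeric() and keep only those that pass. This is a sketch on a made-up data frame (dd2 and its column names are hypothetical, not part of the example above):

```r
set.seed(2)                                   # arbitrary seed, only for reproducibility

# Hypothetical data frame mixing a factor with two numeric columns
dd2 <- data.frame(g  = factor(rep(c("a", "b"), 20)),
                  x1 = rnorm(40),
                  x2 = runif(40, 0, 10))

# TRUE for numeric columns, FALSE for factor or character columns
num_cols <- vapply(dd2, is.numeric, logical(1))

# drop = FALSE keeps a data frame even if only one column is numeric
R <- cor(dd2[, num_cols, drop = FALSE])
R
```

The same idea should carry over to any data frame with a mix of column types.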
HTH,
Dennis
On Thu, Jun 2, 2011 at 8:48 AM, Bill Hyman <billhyman1 at yahoo.com> wrote:
> Dear all,
>
> I have a problem. I have m variables each of which has n observations. I want to
> calculate pairwise correlation among the m variables and store the values in a m
> x m matrix. It is extremely slow to use nested 'for' loops if m and n are large.
> Is there any efficient alternative to do this? Many thanks for your
> suggestions!!
>
> Bill
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>