[R] weighted mean and by() with two index
William Dunlap
wdunlap at tibco.com
Tue Apr 14 04:00:01 CEST 2009
Note that the output of by() is a matrix, but with some extra
attributes added to it.
Since you didn't supply any data I made up some that might
resemble yours.
> set.seed(1)
> re <- list(meta.sales.lkm = data.frame(pc = runif(40), sales = rpois(40, 3),
      size = sample(c("small", "medium", "large"), size = 40, replace = TRUE),
      yr = sample(1994:1998, size = 40, replace = TRUE)))
I ran your by() call
> tmp <- by(re$meta.sales.lkm[, c("pc", "sales")],
      re$meta.sales.lkm[, c("size", "yr")],
      function(x) weighted.mean(x[,1], x[,2]))
and looked at it with dput and saw that all the usual
components of a matrix are in it
> dput(tmp)
structure(c(0.86969084572047, 0.687022846657783, 0.40032217082464,
0.125555095961317, 0.529131081343318, 0.64538708513137, 0.613078526553831,
0.663822646145351, 0.48206098045921, 0.333916208640273, 0.513083046752339,
NA, 0.457996427547187, 0.30292882991489, NA), .Dim = c(3L, 5L
), .Dimnames = structure(list(size = c("large", "medium", "small"
), yr = c("1994", "1995", "1996", "1997", "1998")), .Names = c("size",
"yr")), call = by.data.frame(data = re$meta.sales.lkm[, c("pc",
"sales")], INDICES = re$meta.sales.lkm[, c("size", "yr")],
    FUN = function(x) weighted.mean(x[, 1], x[, 2])), class = "by")
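The same thing can be checked without reading the dput() output by
querying the attributes directly. This is only a sketch: the data frame
'd' is a hypothetical stand-in built like the made-up data above, and
random draws differ between R versions, so the cell values need not
match the ones shown here.

```r
# Made-up stand-in data; 'd' is a hypothetical name, not from the original post.
set.seed(1)
d <- data.frame(pc = runif(40), sales = rpois(40, 3),
                size = sample(c("small", "medium", "large"), 40, replace = TRUE),
                yr = sample(1994:1998, 40, replace = TRUE))

tmp <- by(d[, c("pc", "sales")], d[, c("size", "yr")],
          function(x) weighted.mean(x[, 1], x[, 2]))

attr_names <- names(attributes(tmp))  # the attributes carried by the 'by' object
n_dims     <- length(dim(tmp))        # number of dimensions (size by yr)
dim_labels <- names(dimnames(tmp))    # labels of the two dimensions
print(attr_names)
```

The "dim" and "dimnames" attributes are what make it a matrix; "call"
and "class" are the extras that by() adds.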
It is just the print method for 'by' objects that makes it look
different.
Since there is no special 'by' method for '[' you can use tmp[,] to view
the matrix part of it
> tmp[,]
        yr
size          1994      1995      1996      1997      1998
  large  0.8696908 0.1255551 0.6130785 0.3339162 0.4579964
  medium 0.6870228 0.5291311 0.6638226 0.5130830 0.3029288
  small  0.4003222 0.6453871 0.4820610        NA        NA
If there were a [.by then you might have to manually remove the
"call" attribute and change the class to "matrix".
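For completeness, here is a rough sketch of that manual route, in case
it is ever needed: unclass() drops the "by" class and assigning NULL to
the "call" attribute removes it, which should leave an ordinary matrix.
The names 'd' and 'm' are hypothetical, and the data are made up in the
same way as above.

```r
# Made-up stand-in data; 'd' and 'm' are hypothetical names.
set.seed(1)
d <- data.frame(pc = runif(40), sales = rpois(40, 3),
                size = sample(c("small", "medium", "large"), 40, replace = TRUE),
                yr = sample(1994:1998, 40, replace = TRUE))

tmp <- by(d[, c("pc", "sales")], d[, c("size", "yr")],
          function(x) weighted.mean(x[, 1], x[, 2]))

m <- unclass(tmp)        # drop the "by" class; dim and dimnames are kept
attr(m, "call") <- NULL  # remove the recorded call
print(m)                 # now printed by the default matrix print method
```

As long as there is no [.by method, tmp[,] gives the same table with
less typing.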
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
---------------------------------------
[R] weighted mean and by() with two index
Dong H. Oh r.arecibo at gmail.com
Tue Apr 14 00:56:28 CEST 2009
Hi expeRts,
I would like to calculate a weighted mean by two factors.
My code is as follows:
R> tmp <- by(re$meta.sales.lkm[, c("pc", "sales")],
       re$meta.sales.lkm[, c("size", "yr")],
       function(x) weighted.mean(x[,1], x[,2]))
The result is as follows:
R> tmp
size: micro
yr: 1994
[1] 1.090
------------------------------------------------------------
size: small
yr: 1994
[1] 1.135
------------------------------------------------------------
size: medium
yr: 1994
[1] 1.113
------------------------------------------------------------
size: large
yr: 1994
[1] 1.105
------------------------------------------------------------
size: micro
yr: 1995
[1] 1.167
------------------------------------------------------------
size: small
yr: 1995
[1] 1.096
------------------------------------------------------------
size: medium
yr: 1995
[1] 1.056
....
....
But the form I want to get is as follows:
          1994    1995    1996 .....
micro    1.090   1.167   .............
small    1.135   1.096
medium   1.113   1.056   .... ........
large    1.105   ....... ...........
That is, the result should be laid out as a table.
How can I get the above form directly? (I don't want to modify tmp with
as.vector() and matrix() to get the result)
Thank you in advance.
--------------------------------------------------------------------------
Donghyun Oh
CESIS, KTH
--------------------------------------------------------------------------