[R] Aggregating data (with more than one function)
Adaikalavan Ramasamy
ramasamy at cancer.org.uk
Tue Mar 29 03:59:21 CEST 2005
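In the Arguments section of help(aggregate), you will find:
FUN: a scalar function to compute the summary statistics which can
be applied to all data subsets.
As documented, FUN should return a single value, so a single aggregate
call is not meant to return both the mean and the sum.
a) So you can try the 'by' function: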
> by( df[ , 3], df$Department, function(x) c(mean(x), sum(x)) )
INDICES: Finance
[1] 83925.67 251777.00
------------------------------------------------------------
INDICES: HR
[1] 63333.33 190000.00
------------------------------------------------------------
INDICES: IT
[1] 59928.67 179786.00
------------------------------------------------------------
INDICES: Sales
[1] 62481.67 187445.00
b) or use tapply more directly:
> tmp <- tapply( df$Salary, df$Department, function(x) c( mean(x), sum(x) ) )
> tmp
$Finance
[1] 83925.67 251777.00
$HR
[1] 63333.33 190000.00
$IT
[1] 59928.67 179786.00
$Sales
[1] 62481.67 187445.00
Calling 'sapply( tmp, c )' then gives you a slightly more compact
output, with the mean in row 1 and the sum in row 2:
       Finance        HR        IT     Sales
[1,]  83925.67  63333.33  59928.67  62481.67
[2,] 251777.00 190000.00 179786.00 187445.00
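If you would rather end up with one row per department, as you had after combining the two aggregate results, the tapply approach can be sketched as below. This is only one way to do it: the names df, stats, wide and result are assumptions for illustration, and I have not timed it against the two-aggregate approach on a large data set.

```r
# Data frame rebuilt from the quoted question (the name 'df' is assumed)
df <- data.frame(
  LastName   = c("Johnson", "James", "Howe", "Jones", "Norwood", "Benson",
                 "Smith", "Baker", "Dempsey", "Nolan", "Garth", "Jameson"),
  Department = c("IT", "HR", "Finance", "Finance", "IT", "Sales",
                 "Sales", "HR", "HR", "Sales", "Finance", "IT"),
  Salary     = c(56000, 54223, 80000, 82000, 67000, 76000,
                 65778, 56778, 78999, 45667, 89777, 56786)
)

# Compute both statistics in a single pass over the groups
stats <- tapply(df$Salary, df$Department,
                function(x) c(mean = mean(x), sum = sum(x)))

# sapply() binds the per-department vectors as columns;
# t() turns that into one row per department
wide <- t(sapply(stats, c))

# Move the department names from the row names into a proper column
result <- data.frame(Department = rownames(wide), wide)
rownames(result) <- NULL
result
```

The result has the columns Department, mean and sum, with departments in alphabetical order because tapply groups by the sorted levels.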
Regards, Adai
On Mon, 2005-03-28 at 19:15 -0600, Sivakumaran Raman wrote:
> I have data similar to the following in a data frame:
>    LastName Department Salary
> 1   Johnson         IT  56000
> 2     James         HR  54223
> 3      Howe    Finance  80000
> 4     Jones    Finance  82000
> 5   Norwood         IT  67000
> 6    Benson      Sales  76000
> 7     Smith      Sales  65778
> 8     Baker         HR  56778
> 9   Dempsey         HR  78999
> 10    Nolan      Sales  45667
> 11    Garth    Finance  89777
> 12  Jameson         IT  56786
>
> I want to calculate both the mean salary broken down by Department and
> also the total amount paid out per department, i.e. I want both
> sum(Salary) and mean(Salary) for each Department. Right now, I am using
> aggregate.data.frame twice, creating two data frames, and then combining
> them using data.frame. However, this seems to be very memory and
> processor intensive and is taking a very long time on my data set. Is
> there a quicker way to do this?
>
> Thanks in advance,
> Siv Raman