[R] perform t.test by rows and columns in data frame
Kara Przeczek
przeczek at unbc.ca
Fri Feb 24 00:27:57 CET 2012
Sorry. I forgot to note that I am using R version 2.8.0.
________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] on behalf of Kara Przeczek [przeczek at unbc.ca]
Sent: February 23, 2012 3:13 PM
To: r-help at r-project.org
Subject: [R] perform t.test by rows and columns in data frame
Dear R Help,
I have been struggling with this problem without making much headway. I am attempting to avoid using a loop, and would appreciate any suggestions you may have. I am not well versed in R and apologize in advance if I have missed something obvious.
I have a data set with multiple sites along a river where metal concentrations were measured. Three sites are located upstream of a mine and three sites are located downstream of the mine. I would like to compare the upstream and downstream metal levels using a t-test.
The data set looks something like this (but with more metals (25) and sites (6)):
TotalMetals  Mean  Site  Location
Al           6000  1     us
Sb           0.6   1     us
Ba           150   1     us
Al           6500  2     us
Sb           0.7   2     us
Ba           160   2     us
Al           5600  3     ds
Sb           0.8   3     ds
Ba           180   3     ds
Al           170   4     ds
Sb           0.8   4     ds
Ba           175   4     ds
I have tried several variations of by() and aggregate() and tapply() without much luck. I thought I had finally got what I wanted with:
by(mr2$Mean, mr2$TotalMetals, function (x) t.test(mr2$Mean[mr2$Location=="us"], mr2$Mean[mr2$Location=="ds"]))
However, the output, although grouped by metal, was identical for every metal: the reported "mean of x" and "mean of y" were the means over all metals at the upstream and downstream sites, respectively.
mean(mr2$Mean[mr2$Location=="us"]) #gave the x mean from the output and,
mean(mr2$Mean[mr2$Location=="ds"]) #gave the same y mean from the output
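A likely explanation for the identical results, offered as a sketch rather than a definitive answer: in the by() call above, the function receives each metal's values as x but never uses x. It indexes the full mr2 instead, so every group runs the same test. Passing the whole data frame to by() means the function only sees that metal's rows. The sample data below is retyped from the table in this post, and the names d and res are illustrative.

```r
# Sample data retyped from the table in this post (4 sites, 3 metals).
mr2 <- data.frame(
  TotalMetals = rep(c("Al", "Sb", "Ba"), times = 4),
  Mean        = c(6000, 0.6, 150, 6500, 0.7, 160,
                  5600, 0.8, 180, 170, 0.8, 175),
  Site        = rep(1:4, each = 3),
  Location    = rep(c("us", "ds"), each = 6)
)

# Hand by() the data frame itself, so that d holds only one metal's rows.
res <- by(mr2, mr2$TotalMetals, function(d) {
  t.test(d$Mean[d$Location == "us"], d$Mean[d$Location == "ds"])
})

# Each element is an ordinary t.test result for a single metal.
res[["Al"]]$estimate
```

With this form the "mean of x" and "mean of y" differ from metal to metal, as they do in the one-metal-at-a-time approach.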
I can get the answer I want by performing the t-test for each metal individually with:
y=mr2[mr2$TotalMetals=="Al",]
t.test(y$Mean[y$Location=="us"], y$Mean[y$Location=="ds"])
But it would be painstaking to do this for each metal. In addition, the data set will be getting larger in the future.
It would also be nice to collect the results in a table or similar format so they are easy to output, if possible.
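One possible way to get such a table without an explicit loop, sketched under the assumption that the data are laid out as in the sample above: split the data frame by metal, run the same two-sample test on each piece, and bind the one-row summaries together. The helper name one_test and the output column names are made up for illustration. sapply-style base functions are used throughout, so this should not depend on a recent R version.

```r
# Sample data retyped from the table in this post (4 sites, 3 metals).
mr2 <- data.frame(
  TotalMetals = rep(c("Al", "Sb", "Ba"), times = 4),
  Mean        = c(6000, 0.6, 150, 6500, 0.7, 160,
                  5600, 0.8, 180, 170, 0.8, 175),
  Site        = rep(1:4, each = 3),
  Location    = rep(c("us", "ds"), each = 6)
)

# Hypothetical helper: run the upstream vs. downstream t-test on the rows
# of one metal and return the key numbers as a one-row data frame.
one_test <- function(d) {
  tt <- t.test(d$Mean[d$Location == "us"], d$Mean[d$Location == "ds"])
  data.frame(
    mean_us   = unname(tt$estimate[1]),
    mean_ds   = unname(tt$estimate[2]),
    statistic = unname(tt$statistic),
    df        = unname(tt$parameter),
    p.value   = tt$p.value
  )
}

# One row per metal; the row names are the metal names.
results <- do.call(rbind, lapply(split(mr2, mr2$TotalMetals), one_test))
results
```

The resulting data frame can then be written out with write.csv(results, "ttests.csv") or similar; the file name here is only an example.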
I would greatly appreciate any help that you could provide!
Thank you,
Kara
Natural Resources and Environmental Studies, MSc
University of Northern B.C.