[R] Descriptive Stats from Data Frame
Rich Shepard
rshepard at appl-ecosys.com
Tue Aug 30 23:00:13 CEST 2011
I can't find how to do what I need in Dalgaard or the 'R Cookbook', so
I'm asking here.
I have a data frame with water chemistry data and I want to start
exploring these data. There are three factors (site, date, chemical)
associated with each measurement. The data frame looks like this:
> summary(chemdata)
site_id.sample_date.param.quant
BC-0.5|1996-04-19|Arsenic|0.01 : 1
BC-0.5|1996-04-19|Calcium|76.56 : 1
BC-0.5|1996-04-19|Chloride|12 : 1
BC-0.5|1996-04-19|Magnesium|43.23 : 1
BC-0.5|1996-04-19|Sulfate|175 : 1
BC-0.5|1996-04-19|Total Dissolved Solids|460: 1
(Other) :14880
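A note on the output above: summary() reports a single column named site_id.sample_date.param.quant whose values hold all four fields joined by "|". That pattern usually means the file was read without "|" as the field separator. The following is a minimal sketch of re-reading it, assuming the source is pipe-delimited text with a header line. The real file name is not given, so the sample rows from the summary are passed inline through the text= argument, and the header line is inferred from the column name.

```r
# Sample rows copied from the summary output above.
# Assumption: the header line is pipe-delimited like the data rows.
raw <- "site_id|sample_date|param|quant
BC-0.5|1996-04-19|Arsenic|0.01
BC-0.5|1996-04-19|Calcium|76.56
BC-0.5|1996-04-19|Chloride|12
BC-0.5|1996-04-19|Magnesium|43.23
BC-0.5|1996-04-19|Sulfate|175
BC-0.5|1996-04-19|Total Dissolved Solids|460"

# sep = "|" splits each line into four columns instead of one
chemdata <- read.table(text = raw, sep = "|", header = TRUE,
                       stringsAsFactors = FALSE)

# Convert the date strings to Date objects for later work by date
chemdata$sample_date <- as.Date(chemdata$sample_date)

# Inspect the structure: four columns, with quant numeric
str(chemdata)
```

With a real file, the text= argument would be replaced by the file path as the first argument to read.table().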
First I want to calculate (and plot) descriptive stats by chemical,
ignoring site and date and telling R to skip missing data. (I will
incorporate those factors later.) What I have not been able to figure out is
how to specify the command to, for example, calculate mean and sd for
Arsenic. My floundering and thrashing includes attempts like these:
> mean(chemdata.param="Arsenic")
Error in is.numeric(x) : 'x' is missing
> mean(chemdata.quant, param="Arsenic")
Error in mean(chemdata.quant, param = "Arsenic") :
object 'chemdata.quant' not found
> mean(chemdata$quant, param="Arsenic")
[1] NA
Warning message:
In mean.default(chemdata$quant, param = "Arsenic") :
argument is not numeric or logical: returning NA
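For comparison with the attempts above, here is a minimal sketch of one common way to write this, assuming the data frame has separate columns named site_id, sample_date, param and quant. The toy values below are invented for illustration and are not the real measurements. In the last attempt, mean() has no param argument, so param = "Arsenic" is silently passed along and ignored, and chemdata$quant is likely NULL when no column of that name exists, which would explain the NA and the warning.

```r
# Toy data frame with the four columns named in the post.
# Values are invented for illustration only.
chemdata <- data.frame(
  site_id = c("BC-0.5", "BC-0.5", "BC-1", "BC-1", "BC-2"),
  sample_date = as.Date(c("1996-04-19", "1996-04-19", "1996-05-20",
                          "1996-05-20", "1996-06-18")),
  param = c("Arsenic", "Calcium", "Arsenic", "Calcium", "Arsenic"),
  quant = c(0.01, 76.56, 0.03, 80.00, NA)
)

# A column is selected with $, and rows with a logical test inside [ ]
arsenic <- chemdata$quant[chemdata$param == "Arsenic"]

# na.rm = TRUE drops missing values before computing
arsenic_mean <- mean(arsenic, na.rm = TRUE)
arsenic_sd <- sd(arsenic, na.rm = TRUE)

# The same statistics for every chemical at once, ignoring site and date
by_param_mean <- tapply(chemdata$quant, chemdata$param, mean, na.rm = TRUE)
by_param_sd <- tapply(chemdata$quant, chemdata$param, sd, na.rm = TRUE)

print(by_param_mean)
print(by_param_sd)
```

For a first plot by chemical, boxplot(quant ~ param, data = chemdata) is one option, though chemicals measured on very different scales may be easier to read when plotted separately.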
As a newcomer to R I've done a lot of reading, yet all the examples use
nicely structured data to illustrate the point being made. I need to work
with my data and learn how to specify columns and write correct commands for
the analyses I need. Please point me in the right direction.
Rich