[R] Use of geometric mean for geochemical concentrations
Leo Mada
leo.mada using syonic.eu
Tue Jan 30 18:08:16 CET 2024
Dear Rich,
Various geochemical and biological processes generate these compounds and metabolites; nitrates, heavy metals, and other ions may each be generated differently.
Such processes may vary with the season or with particular events: rain, melting of ice, or rare events such as a spill from an artificial source.
I am not an expert in geochemistry or agriculture, but I would say that certain events happen only occasionally and do not reflect the "baseline". Melting of ice in the mountains may increase the concentrations of certain ions. Application of fertilizers to crops may increase the concentrations of some organic compounds. Algal blooms happen during summer.
Furthermore, the increase in concentration may be *non*-linear! This is the reason why a "robust" alternative (in this case the geometric mean) is used.
If you want specific details, you would need to research the possible sources of each specific constituent and assess whether those events release exponentially large amounts of the compound of interest.
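To illustrate the argument, here is a minimal sketch with made-up concentrations (the values and variable names are hypothetical, not measurements): a stable baseline plus one rare spike. It compares how much the arithmetic and the geometric mean shift when the spike is included. Note that the geometric mean is only less sensitive to the spike; it is not robust in the strict sense that the median is.

```r
# Hypothetical baseline concentrations (arbitrary units) and one rare spike
baseline <- c(2, 2.5, 3, 2, 2.5)
spike    <- 200                      # e.g. a spill; made-up value
all_obs  <- c(baseline, spike)

geom_mean <- function(x) exp(mean(log(x)))  # requires x > 0

arith_base <- mean(baseline)
arith_all  <- mean(all_obs)
geom_base  <- geom_mean(baseline)
geom_all   <- geom_mean(all_obs)

# Relative inflation caused by the single spike
print(c(arithmetic = arith_all / arith_base,
        geometric  = geom_all / geom_base))
```

With these numbers the single spike inflates the arithmetic mean far more than the geometric mean.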
Sincerely,
Leonard
________________________________
From: Leo Mada <leo.mada using syonic.eu>
Sent: Tuesday, January 30, 2024 3:49 PM
To: Rich Shepard <rshepard using appl-ecosys.com>
Cc: r-help using r-project.org <r-help using r-project.org>; Richard O'Keefe <raoknz using gmail.com>
Subject: Re: [R] Use of geometric mean for geochemical concentrations
Dear Rich,
It depends on how the data are generated.
Although I am not an expert in ecology, I can explain it based on a biomedical example.
Certain variables are generated geometrically (exponentially), e.g. MIC or Titer.
MIC = Minimum Inhibitory Concentration for bacterial resistance
Titer = the highest dilution that still has an effect, e.g. when serially diluting blood samples.
Obviously, diluting the samples will generate the following concentrations:
1, 1/2, 1/4, 1/8, 1/16, ...
(or the reciprocal: 1, 2, 4, 8, 16, ...)
It makes no sense to compute the arithmetic mean of such values. Results are usually reported as some quantile (the median or the 90th percentile); alternatively, one computes the geometric mean.
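A minimal sketch of this point, using made-up reciprocal titers from a two-fold dilution series (the data are hypothetical). The geometric mean is computed as the exponential of the mean of the logarithms:

```r
# Hypothetical reciprocal titers from a two-fold dilution series
titer <- c(1, 2, 4, 8, 16)

arith <- mean(titer)             # arithmetic mean
geom  <- exp(mean(log(titer)))   # geometric mean = exp(mean of logs)
med   <- median(titer)

print(c(arithmetic = arith, geometric = geom, median = med))
```

For this series the geometric mean equals the median (4), a value that actually occurs on the dilution scale, whereas the arithmetic mean (6.2) is pulled towards the largest titer.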
### Ecology / Environmental Chemistry
I suppose that certain chemicals may be generated or released into the environment through a non-linear process. The LLOD (lower limit of detection) may also play a role, but may NOT be the main reason. If the generating process is exponential, then the arithmetic mean would strongly skew the results, and would do so inconsistently across seasons, particular years, etc., because the generating processes may differ.
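A small simulation may make this concrete. It is only a sketch under a strong assumption: that concentrations follow a log-normal distribution (one simple model of a multiplicative or exponential generating process), with the same log-scale centre in two hypothetical "seasons" but a different spread. The parameters and names are invented for illustration.

```r
set.seed(1)  # for reproducibility of this illustration

# Assumed model: log-normal concentrations, same log-scale centre,
# but a much larger spread in the second (hypothetical) season
calm   <- rlnorm(1000, meanlog = 1, sdlog = 0.5)
stormy <- rlnorm(1000, meanlog = 1, sdlog = 1.5)

geom_mean <- function(x) exp(mean(log(x)))

res <- rbind(
  calm   = c(arithmetic = mean(calm),   geometric = geom_mean(calm)),
  stormy = c(arithmetic = mean(stormy), geometric = geom_mean(stormy))
)
print(res)
```

Under this assumed model the geometric mean estimates exp(meanlog) in both seasons, while the arithmetic mean also depends on the spread and therefore differs between them.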
### Harmonic Mean
I have not encountered it often, maybe because of the problematic handling of 0.
In the meantime I have found a nice workaround for 0 (which also works with the geometric mean); see also (unfortunately not well documented):
https://github.com/discoleo/R/blob/master/Stat/Moments.Stat.R
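# x = numeric vector of non-negative values
v0 = 1; # some initial "skew" (shift), so that zeros can be handled
# shifted harmonic mean, defined by: 1/(xharm + v0) = sum(1/(x + v0)) / length(x)
xharm = length(x) / sum(1 / (x + v0)) - v0;
# shifted geometric mean
xgeom = prod(x + v0)^(1/length(x)) - v0;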
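The formulas above can be wrapped into small functions and tried on data containing a zero. This is only a sketch: the function names `shifted_geom_mean` and `shifted_harm_mean` and the example data are hypothetical and are not taken from the linked file.

```r
# Hypothetical helper names; v0 is the shift ("skew") from the formulas above
shifted_geom_mean <- function(x, v0 = 1) {
  exp(mean(log(x + v0))) - v0   # same as prod(x + v0)^(1/length(x)) - v0
}
shifted_harm_mean <- function(x, v0 = 1) {
  1 / mean(1 / (x + v0)) - v0
}

x <- c(0, 1, 3, 7)               # made-up data containing a zero

g_shift <- shifted_geom_mean(x)
h_shift <- shifted_harm_mean(x)
g_plain <- prod(x)^(1 / length(x))  # plain geometric mean collapses to 0

print(c(shifted_geom = g_shift, shifted_harm = h_shift, plain_geom = g_plain))
```

The result depends on the chosen shift `v0`, so the same value should be used whenever such means are compared.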
I apologize for the late reply; I did not have much time to read messages during the past weeks.
Sincerely,
Leonard