[R] Unweighted meta-analysis
Emmanuel Charpentier
charpent at bacbuc.dyndns.org
Mon Nov 26 21:15:00 CET 2007
Roy Sanderson wrote:
> Hello
>
> I'm very much a beginner on meta-analysis, so apologies if this is a
> trivial posting. I've been sent a set of data from separate experimental
> studies, Treatment and Control, but no measure of the variance of effect
> sizes, numbers of replicates etc. Instead, for each study, all I have
> is the mean value for the treatment and control (but not the SD).
With possibly three very special kinds of exceptions, what you've been
sent is insufficient for any kind of analysis (meta- or otherwise) : no
way to assess variability, hence no way to assess the relative importance
of noise to data or the relative importance of different sets of data.
One possible exception is when the very nature of the experiment implies
that your data come from a truly one-parameter distribution. I'm
thinking, for example, of count data of rare events, which might, under
not-so-unreasonable(-in-special-circumstances) conditions, come from a
Poisson distribution.
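
To make the one-parameter case concrete: for a Poisson count the variance equals the mean, so a single count per group already carries its own measure of noise. The following is only a minimal sketch, assuming equal exposure (same observation time or area) in both groups; the function name and the counts are hypothetical and not taken from the original data. Under H0 (equal rates), and given the total count, the treatment count is Binomial(n, 0.5).

```python
from math import comb

def poisson_two_sample_pvalue(count_treatment, count_control):
    """Exact conditional test of equal Poisson rates.

    ASSUMPTION: both counts were collected with the same exposure.
    Under H0, given n = count_treatment + count_control,
    count_treatment follows Binomial(n, 0.5).
    """
    n = count_treatment + count_control
    if n == 0:
        return 1.0  # nothing observed, no evidence either way
    # Two-sided p-value: add up all outcomes at least as far from n/2
    # as the observed treatment count.
    observed_dev = abs(count_treatment - n / 2)
    extreme = sum(comb(n, k) for k in range(n + 1)
                  if abs(k - n / 2) >= observed_dev)
    return extreme / 2 ** n

# Hypothetical counts, for illustration only
p_value = poisson_two_sample_pvalue(3, 12)
print(p_value)
```

With unequal exposures the null proportion is no longer 0.5, and the sketch would need to be adapted accordingly.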
Another possible set of exceptions arises when the second (and
following) parameter(s) can be deduced from "obvious" general knowledge.
For example (set in a semi-imaginary setup), one may give you the number
of people using a given service at least once during a specified period,
*provided* that in order to use this service, people have to register
with the service provider first. The data might be a simple number (no
valid indication of variability, if service use is too frequent to be
modeled by a Poisson distribution), but looking up the number of
registered users in some data bank might provide you with a valid
proportion and population size, which is enough to meta-analyze.
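
As a rough illustration of why a proportion plus a population size is enough: under a binomial model (an assumption, which requires independent users), the standard error of the proportion follows from those two numbers alone, and its inverse square is the usual inverse-variance weight. The function names and the figures below are hypothetical.

```python
from math import sqrt

def proportion_with_se(users, registered):
    """Proportion of registered people who used the service, with its
    binomial standard error. ASSUMPTION: users behave independently."""
    p = users / registered
    se = sqrt(p * (1 - p) / registered)
    return p, se

def inverse_variance_weight(se):
    """Weight a study would typically receive in a weighted meta-analysis."""
    return 1 / se ** 2

# Hypothetical figures, for illustration only
p, se = proportion_with_se(users=120, registered=400)
weight = inverse_variance_weight(se)
print(p, se, weight)
```

A bare count of 120 says nothing about variability; the same count out of 400 registered users does.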
But the third possibility is of course that your "means" are indeed the
result of experimenting on *ONE* experimental unit of each group. This
is highly dependent on what is measured and how (an example of this
might be industrial production per unit time with two different sets of
tools/machines in various industrial setups : here, the experimental
unit is the industrial setup, and your "mean" is *one* measure of
speed). Then, you have *individual* data, that you should analyze
accordingly (e. g. t-test or Wilcoxon test if there is no relationship
between "treated" and "control" experimental units, paired t-test or
paired Wilcoxon test if you are told that the "means" may be related,
etc ...). This is not a "meta-analysis", but an analysis.
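
For the paired case, a minimal sketch of the statistic involved, assuming each "mean" really is one measurement on one experimental unit and that treated and control values are matched setup by setup. The values are hypothetical. Only the t statistic and its degrees of freedom are computed here; the p-value needs the Student t distribution, which in R is what `t.test(treated, control, paired = TRUE)` reports.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(treated, control):
    """Paired t statistic and degrees of freedom.

    ASSUMPTION: treated[i] and control[i] come from the same setup,
    and the differences are roughly normally distributed.
    """
    diffs = [t - c for t, c in zip(treated, control)]
    n = len(diffs)
    # Mean difference divided by its standard error
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1

# Hypothetical measurements, one per industrial setup
treated = [12.1, 9.8, 11.4, 10.9, 13.0]
control = [10.0, 9.5, 10.2, 10.1, 11.1]
t_stat, df = paired_t_statistic(treated, control)
print(t_stat, df)
```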
Outside these possibilities, I see no point in "meta-analysing" anything
that isn't analysable by itself.
> As
> far as I can tell, this forces me into an unweighted meta-analysis, with
> all the caveats and dangers associated with it.
As far as I can tell, you're forced to tell your sponsor/tutor/whatever
either that he doesn't know what he asks for or that he's trying to
fool you (and you saw it !) ; which might lead you to ask him to rethink
his question, give you more information about the measurements and
experimental setup, to provide you (or help you find) the missing data,
to stop playing silly games or to go fly a kite...
> Two possible approaches
> might be:
>
> a) Take the ln(treatment/control) and perform a Fisher's randomisation
> test (and also calculate +/- CI).
> b) Regress the treatment vs control values, then randomise (with or
> without replacement?) individual values, comparing the true regression
> coefficient with the distribution of randomisation regression
> coefficients.
I haven't the foggiest idea of what you're trying to do here :
introducing artificial variability in order to separate it from variation
between groups ?
Unless you are in the case of "one experiment = one experimental unit
per group" (see above) with no information about variability, the only
information you can use is the *sign* of the difference
"Experimental"-"Control" : if all or "almost all" of them go "in the
same direction", one might be tempted to conclude that this direction is
not random (that's the sign test). But this is only valid if the
hypothesis "Direction is 50/50 random under H0" has validity under the
experimental setup, which your story doesn't tell...
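
The sign test described above can be sketched as follows. This is an illustration under the stated assumption that each direction is equally likely under H0; the function name and the study means are hypothetical. Studies with a zero difference are dropped, and the two-sided p-value comes from Binomial(n, 0.5).

```python
from math import comb

def sign_test_pvalue(treatment_means, control_means):
    """Exact two-sided sign test on paired study means.

    ASSUMPTION: under H0 each difference is positive or negative
    with probability 0.5, independently across studies.
    """
    diffs = [t - c for t, c in zip(treatment_means, control_means)]
    positives = sum(d > 0 for d in diffs)
    negatives = sum(d < 0 for d in diffs)
    n = positives + negatives  # ties carry no sign information
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    # Probability of a split at least this lopsided in one direction
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical study means: 9 of 10 differences are positive
treatment = [5.1, 6.3, 4.8, 7.0, 5.9, 6.1, 5.5, 6.8, 4.9, 5.2]
control = [4.7, 5.9, 4.1, 6.2, 5.0, 5.8, 5.6, 6.0, 4.5, 4.9]
sign_p = sign_test_pvalue(treatment, control)
print(sign_p)
```

Note that the test uses only the signs: the magnitudes of the means, which cannot be weighed without variability information, play no role.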
> Both approaches would appear to be fraught with risks; for example in
> the regression approach, it is probable that the error distribution of
> an individual randomised regression might not be normal - would this
> then invalidate the whole set of regressions?
Again, you'd work against an artificial variability that you'd have
introduced yourself : what is the point ?
HTH,
Emmanuel Charpentier