[R] Comparing multiple distributions
    Ravi Varadhan 
    rvaradhan at jhmi.edu
       
    Thu May 31 18:09:33 CEST 2007
    
    
  
Your data are "compositional data". The R package "compositions" might be
useful. You might also want to consult the book by J. Aitchison, "The
Statistical Analysis of Compositional Data".
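
As a rough illustration of what treating the data as compositional can mean in practice, here is a minimal base-R sketch of the additive log-ratio (alr) transform, one of the transforms discussed by Aitchison. The matrix `p` and the helper `alr()` are made up for this example and are not taken from the "compositions" package, which provides its own transforms. The sketch assumes every proportion is strictly positive; depth bins with zero abundance would need separate handling.

```r
# Hypothetical example data: 3 observations, 5 proportions each (rows sum to 1).
p <- rbind(c(0.10, 0.20, 0.30, 0.25, 0.15),
           c(0.05, 0.15, 0.40, 0.30, 0.10),
           c(0.20, 0.20, 0.20, 0.20, 0.20))

# Additive log-ratio transform: log of each part relative to the last part.
# Assumption: all proportions are > 0 (log-ratios are undefined for zeros).
alr <- function(x) {
  x <- as.matrix(x)
  D <- ncol(x)
  # Dividing an n x (D-1) matrix by a length-n vector recycles down the
  # columns, so row i is divided by x[i, D].
  log(x[, -D, drop = FALSE] / x[, D])
}

z <- alr(p)
dim(z)   # one row per observation, D - 1 = 4 unconstrained coordinates
```

The transformed rows are no longer constrained to sum to 1, so standard multivariate methods can be applied to them.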
Ravi.
-----------------------------------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology 
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvaradhan at jhmi.edu
Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
 
-----------------------------------------------------------------------------------
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of jiho
Sent: Thursday, May 31, 2007 11:37 AM
To: R-help
Subject: Re: [R] Comparing multiple distributions
Nobody answered my first request. I am sorry if I did not explain my
problem clearly. English is not my native language and statistical
English is even more difficult. I'll try to summarize my issue in
more appropriate statistical terms:
Each of my observations is not a single number but a vector of 5
proportions (which add up to 1 for each observation). I want to
compare the "shape" of those vectors between two treatments (i.e. how
the quantities are distributed among the 5 values in treatment A
compared with treatment B).
I was pointed to Hotelling's T-squared. Does it seem appropriate? Are
there other possibilities (I read many discussions about Hotelling
vs. MANOVA but I could not see how any of them related to my
particular case)?
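
To make the question concrete, here is a minimal base-R sketch of a two-sample Hotelling T-squared test applied to log-ratio transformed proportions. Everything in it is an assumption made for illustration: the data are simulated, `sim_comp()`, `alr()` and `hotelling2()` are hypothetical helpers, all proportions are strictly positive, and the log-ratios are treated as multivariate normal with a common covariance matrix in both treatments. It is one possible approach, not necessarily the appropriate one for the plankton data.

```r
set.seed(1)

# Hypothetical data: n compositions of 5 parts, drawn as normalised gamma
# variates (i.e. Dirichlet); each row sums to 1.
sim_comp <- function(n, alpha) {
  g <- matrix(rgamma(n * length(alpha), shape = rep(alpha, each = n)), nrow = n)
  g / rowSums(g)
}
A <- sim_comp(20, c(2, 4, 6, 4, 2))   # treatment A
B <- sim_comp(20, c(6, 4, 3, 2, 1))   # treatment B

# Additive log-ratio transform (last part as denominator); needs proportions > 0.
alr <- function(x) log(x[, -ncol(x), drop = FALSE] / x[, ncol(x)])

# Two-sample Hotelling T-squared with pooled covariance and its F approximation.
hotelling2 <- function(x, y) {
  n1 <- nrow(x); n2 <- nrow(y); p <- ncol(x)
  d  <- colMeans(x) - colMeans(y)
  S  <- ((n1 - 1) * cov(x) + (n2 - 1) * cov(y)) / (n1 + n2 - 2)
  T2 <- (n1 * n2 / (n1 + n2)) * drop(t(d) %*% solve(S, d))
  Fstat <- T2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
  pval  <- pf(Fstat, p, n1 + n2 - p - 1, lower.tail = FALSE)
  list(T2 = T2, F = Fstat, df = c(p, n1 + n2 - p - 1), p.value = pval)
}

res <- hotelling2(alr(A), alr(B))
res$p.value
```

Because T-squared is unchanged by nonsingular linear transformations of the variables, the result does not depend on which part is used as the alr denominator. With only two groups, a one-way MANOVA on the same transformed data tests the same hypothesis.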
Thank you very much in advance for your insights. See below for my  
earlier, more detailed, e-mail.
On 2007-May-21, at 19:26, jiho wrote:
> I am studying the vertical distribution of plankton and want to
> study its variations relative to several factors (time of day,
> species, water column structure etc.). So my data is special in
> that, at each sampling site (each observation), I don't have *one*
> number, I have *several* numbers (abundance of organisms in each
> depth bin, I sample 5 depth bins) which describe a vertical
> distribution.
>
> Then let's say I want to compare speciesA with speciesB. I would end
> up trying to compare a group of several distributions with another
> group of several distributions (where a "distribution" is a vector
> of 5 numbers: an abundance for each depth bin). Does anyone know
> how I could do this (with R obviously ;) )?
>
> Currently I work around the problem in two ways:
> - compute the mean abundance per depth bin within each group and
> compare the two mean distributions with ks.test, but this
> obviously diminishes the power of the test (I only compare 5*2
> "observations")
> - reduce the information at each sampling site to the mean depth
> weighted by the abundance of the species of interest. This way I
> have one observation per station, but I reduce the information to
> the mean depth, while the actual distribution across depths is
> also important.
>
> I know this is probably not directly R related but I have already
> searched around for solutions and asked my local statistics
> expert... to no avail. So I hope that the stats experts on this
> list will help me.
>
> Thank you very much in advance.
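
The second workaround in the quoted message (mean depth weighted by abundance) can be sketched in base R as follows. The bin mid-depths and abundances are hypothetical values chosen for illustration; they show why the summary loses information, since two very different vertical profiles can share the same weighted mean depth.

```r
# Hypothetical mid-depths (m) of the 5 depth bins.
depth <- c(5, 15, 25, 35, 45)

# Hypothetical abundances at two stations: one evenly spread over the
# water column, one concentrated in the middle bin.
spread <- c(20, 20, 20, 20, 20)
peaked <- c(0, 0, 100, 0, 0)

# Abundance-weighted mean depth: sum(depth * abundance) / sum(abundance).
z_spread <- weighted.mean(depth, spread)
z_peaked <- weighted.mean(depth, peaked)

# Both profiles have a weighted mean depth of 25 m, although their
# proportions per bin differ completely.
c(z_spread, z_peaked)
rbind(spread = spread / sum(spread), peaked = peaked / sum(peaked))
```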
JiHO
---
http://jo.irisson.free.fr/