[R] analyze summary data

Tue Jun 27 11:38:54 CEST 2006

Ben Bolker wrote:
> Thierry Girard <thierry.girard <at> unibas.ch> writes:
> 
>> I do have summary data (mean, standard deviation and sample size n)  
>> and want to analyze this data.
>> The summary data is supposed to be from a normal distribution.
>>
>> I need the following calculations on this summary data (no, I do not  
>> have the original data):
>>
>> - one sample t-test against a known mu
>> - two sample t-test
>> - analysis of variance between 4 groups.
>>
>> I would appreciate any help available.
>>
>> One possible solution could be to simulate the data using rnorm with  
>> the appropriate n, mu and sd, but I don't know if there would be a  
>> more accurate solution.
> 
> 
>   this is the kind of situation where you need to go back to the basics:
> knowing what computations these statistical tests are _actually
> doing_, which you should be able to find in any basic stats book
> or by digging into the guts of the R functions.  The only other thing
> you need to know is the R functions for cumulative distribution
> functions: pt (for the t distribution) and pf (for the F distribution).
> 
>   For example:
> 
>    stats:::t.test.default
> 
>  has lots of complicated stuff inside but the key lines are
> (for a one sample test)
> 
>   nx <- length(x)                ## sample size n
>   df <- nx - 1
>   stderr <- sqrt(vx/nx)          ## vx is var(x)
>   # if you already have the standard deviation then you want
>   # sqrt(sd^2/nx)
>   tstat <- (mx - mu)/stderr      ## mx is mean(x); mu is the known mean you're testing against
>   pval <- 2 * pt(-abs(tstat), df)
> 
> (assuming 2-tailed)
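
A minimal sketch (not from the original thread) of the key lines above wrapped into a function that takes only the summary statistics. The helper name t_one_sample_summary and the small raw data vector used for checking are assumptions made for illustration; only pt() and t.test() come from R itself.

```r
# One-sample t-test from summary statistics (hypothetical helper name).
# m = sample mean, s = sample standard deviation, n = sample size,
# mu = known mean to test against.
t_one_sample_summary <- function(m, s, n, mu = 0) {
  df <- n - 1
  se <- sqrt(s^2 / n)                # standard error of the mean
  tstat <- (m - mu) / se
  pval <- 2 * pt(-abs(tstat), df)    # two-sided p-value
  list(statistic = tstat, df = df, p.value = pval)
}

# Check against t.test() on made-up raw data: the summary-based result
# should agree with the result computed from the raw values.
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 4.7, 5.3)
res1 <- t_one_sample_summary(mean(x), sd(x), length(x), mu = 5)
ref1 <- t.test(x, mu = 5)
```

Because the t statistic depends on the data only through the mean, standard deviation and n, no simulation with rnorm is needed for this test.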
> 
>   you will find similar stuff for the two-sample t-test,
> depending on your particular choices (e.g. pooled variance with
> var.equal = TRUE versus the default Welch approximation).
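
A sketch of the two-sample case under the same assumptions (summary statistics only, two-sided test). The function name t_two_sample_summary and the check data are hypothetical; the formulas are the standard pooled-variance and Welch versions, which is what the var.equal argument of t.test() switches between.

```r
# Two-sample t-test from summary statistics (hypothetical helper name).
t_two_sample_summary <- function(m1, s1, n1, m2, s2, n2, var.equal = FALSE) {
  if (var.equal) {
    # Pooled-variance (classical Student) test
    df <- n1 + n2 - 2
    sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df
    se <- sqrt(sp2 * (1 / n1 + 1 / n2))
  } else {
    # Welch test with Satterthwaite degrees of freedom
    v1 <- s1^2 / n1
    v2 <- s2^2 / n2
    se <- sqrt(v1 + v2)
    df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  }
  tstat <- (m1 - m2) / se
  list(statistic = tstat, df = df, p.value = 2 * pt(-abs(tstat), df))
}

# Check against t.test() on made-up raw data
a <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 4.7, 5.3)
b <- c(6.4, 7.1, 5.9, 6.8, 7.5, 6.1)
res_welch  <- t_two_sample_summary(mean(a), sd(a), length(a),
                                   mean(b), sd(b), length(b))
res_pooled <- t_two_sample_summary(mean(a), sd(a), length(a),
                                   mean(b), sd(b), length(b),
                                   var.equal = TRUE)
ref_welch  <- t.test(a, b)
ref_pooled <- t.test(a, b, var.equal = TRUE)
```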
> 
>   The 1-way ANOVA might be harder to dig out of the R code;
> there you're better off going back and (re)learning from
> a basic stats treatment how to
> compute the between-group and (pooled) within-group variances.
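
For the one-way ANOVA, the between-group and pooled within-group variances can be sketched from group means, standard deviations and sizes as follows. This is an illustration assuming equal group variances (the classical F test); the name anova_summary and the example data are made up for the check.

```r
# One-way ANOVA from summary statistics (hypothetical helper name).
# m, s, n are vectors of group means, standard deviations and sizes.
anova_summary <- function(m, s, n) {
  k <- length(m)                          # number of groups
  N <- sum(n)                             # total sample size
  grand <- sum(n * m) / N                 # grand mean
  ss_between <- sum(n * (m - grand)^2)    # between-group sum of squares
  ss_within <- sum((n - 1) * s^2)         # pooled within-group sum of squares
  df1 <- k - 1
  df2 <- N - k
  Fstat <- (ss_between / df1) / (ss_within / df2)
  list(statistic = Fstat, df = c(df1, df2),
       p.value = pf(Fstat, df1, df2, lower.tail = FALSE))
}

# Check against anova(lm()) on made-up raw data with 4 groups
y <- c(5.1, 4.9, 6.2, 5.8,  6.4, 7.1, 5.9,  4.2, 4.8, 5.0, 4.5,  6.9, 7.3, 6.6)
g <- factor(rep(c("A", "B", "C", "D"), times = c(4, 3, 4, 3)))
m <- as.vector(tapply(y, g, mean))
s <- as.vector(tapply(y, g, sd))
n <- as.vector(tapply(y, g, length))
res_aov <- anova_summary(m, s, n)
ref_aov <- anova(lm(y ~ g))
```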
> 
>   Bottom line is that, except for knowing about pt and pf,
> this is really a basic statistics question rather than an
> R question.
> 
>   good luck
>     Ben Bolker
> 
> PS: it is too bad, but the increasing sophistication of R is
> making it harder for beginners to explore the guts --- e.g.
> knowing to look for "stats:::t.test.default" in order to find
> the code ...
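
As a sketch of how one might locate such non-exported method code without already knowing the stats::: spelling, the following uses methods(), getAnywhere() and getS3method(), which are standard R utilities; the variable names are just for illustration.

```r
# List the S3 methods available for the generic t.test
m_list <- methods(t.test)

# Search all loaded namespaces for an object by name, exported or not
found <- getAnywhere("t.test.default")

# Retrieve the method function directly; printing f shows its source
f <- getS3method("t.test", "default")
```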

Thanks for the hint. I already had in mind writing an R Help Desk article
about "Finding the code", covering both the R source code, as described
above, and the C code behind the .Primitive, .C, .Call and friends
entry points.
Maybe for the next R News issue, if nobody is willing to contribute to 
the Help Desk column (hint, hint!!!).

Uwe Ligges

> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html