[R] Consult an analysis problem. Thank you!

Mon Dec 5 08:50:12 CET 2005

On Mon, 5 Dec 2005, Ivy_Li wrote:

> Hello everybody,
> 	Could I ask you a question?

If you want free statistical consultancy, please use an informative 
signature giving your affiliation and credentials.

> 	I am doing an analysis of some data.

What was the aim of the analysis?

> I used an ANOVA. The p-value it 
> returned is about 0.275, so no signal. But from the box plot, I 
> think there is a discrepancy between A and B. I then tried 
> ks.test, fisher.test and var.test. The p-values they returned 
> are all unsatisfactory: if we take a p-value below 0.05 to mean a 
> significant difference, all the test results are bigger than 0.1. I 
> don't know why the ANOVA and the other tests cannot detect the 
> difference. Could you help me find out which analysis method fits 
> this case? Thank you very much!
>
> ##################R script
> 	#### -create a data frame
> 	Value <- c(0.01592016, 0.05034839, 0.01810571, 0.05129173, 0.01557562, 0.04321186,
> 		0.01851016, 0.05214449, 0.01912795, 0.02081264, 0.05580136, 0.03097065,
> 		0.01706546, 0.01534989, 0.01367946, 0.01734044, 0.02419865, 0.04541759,
> 		0.08735891, 0.03297321, 0.02311511, 0.05972912, 0.04356657, 0.02234764,
> 		0.01291197, 0.02203159, 0.17550784, 0.08726857, 0.01557562, 0.04486457,
> 		0.01498870)
> 	Group <- c("A", "A", "B", "B", "B", "A", "B", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",
> 		"A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B")
> 	df.new <- data.frame(Value=Value,Group=Group)
> 	#### -plot the box plot
> 	plot(as.factor(df.new$Group),df.new$Value)
> 	points(as.factor(df.new$Group), df.new$Value, pch=16,col=2)
> 	#### -anova test
> 	anova(lm(df.new$Value~as.factor(df.new$Group)))
> ###############end

You need to think about transforming your data.  However, a one-way 
two-class ANOVA is the same as a t-test with equal variances:

> t.test(Value ~ Group, data=df.new, var.equal=TRUE)

and clearly your two samples have different variances.

> var.test(Value ~ Group, data=df.new)$p.value
[1] 0.04595329

so

> t.test(Value ~ Group, data=df.new)

would be better.

So, you tested for a difference in means assuming equal variances, but 
there is a marginally significant difference in variances, and the test 
for a difference in means not assuming equal variances is not significant.
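
As a minimal sketch of the comparison just described, the following 
collects the four p-values side by side using the data from the question. 
The object names (p_anova, p_pooled, p_var, p_welch) are illustrative and 
not part of the original post.

```r
# Data exactly as posted in the question.
Value <- c(0.01592016, 0.05034839, 0.01810571, 0.05129173, 0.01557562, 0.04321186,
           0.01851016, 0.05214449, 0.01912795, 0.02081264, 0.05580136, 0.03097065,
           0.01706546, 0.01534989, 0.01367946, 0.01734044, 0.02419865, 0.04541759,
           0.08735891, 0.03297321, 0.02311511, 0.05972912, 0.04356657, 0.02234764,
           0.01291197, 0.02203159, 0.17550784, 0.08726857, 0.01557562, 0.04486457,
           0.01498870)
Group <- c("A", "A", "B", "B", "B", "A", "B", "A", "A", "A", "A", "A", "A", "A",
           "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",
           "B", "B", "B")
df.new <- data.frame(Value = Value, Group = Group)

# One-way ANOVA with two groups: p-value of the F test for Group.
p_anova <- anova(lm(Value ~ Group, data = df.new))[["Pr(>F)"]][1]

# Equal-variance (pooled) t-test: same hypothesis, same assumptions.
p_pooled <- t.test(Value ~ Group, data = df.new, var.equal = TRUE)$p.value

# F test for equality of the two group variances.
p_var <- var.test(Value ~ Group, data = df.new)$p.value

# Welch t-test (the default in t.test): does not assume equal variances.
p_welch <- t.test(Value ~ Group, data = df.new)$p.value

print(c(anova = p_anova, pooled_t = p_pooled, var_test = p_var, welch_t = p_welch))
```

The first two entries agree, because with two groups the ANOVA F statistic 
is the square of the pooled t statistic.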

Using 1/Value makes sense to equalize the variances.

You are doing things in R in a convoluted way.  Try simply

> boxplot(1/Value ~ Group, data=df.new)
> summary(aov(1/Value ~ Group, data=df.new))
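
To check how far the reciprocal goes towards equalizing the variances, one 
can compare the group variances on both scales. This is only a sketch: the 
names v_raw, v_inv, ratio_raw, ratio_inv and p_var_inv are made up for 
illustration, and the ratio of the larger to the smaller variance is just 
one convenient summary.

```r
# Data exactly as posted in the question.
Value <- c(0.01592016, 0.05034839, 0.01810571, 0.05129173, 0.01557562, 0.04321186,
           0.01851016, 0.05214449, 0.01912795, 0.02081264, 0.05580136, 0.03097065,
           0.01706546, 0.01534989, 0.01367946, 0.01734044, 0.02419865, 0.04541759,
           0.08735891, 0.03297321, 0.02311511, 0.05972912, 0.04356657, 0.02234764,
           0.01291197, 0.02203159, 0.17550784, 0.08726857, 0.01557562, 0.04486457,
           0.01498870)
Group <- c("A", "A", "B", "B", "B", "A", "B", "A", "A", "A", "A", "A", "A", "A",
           "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",
           "B", "B", "B")
df.new <- data.frame(Value = Value, Group = Group)

# Per-group variances on the original and on the reciprocal scale.
v_raw <- tapply(df.new$Value, df.new$Group, var)
v_inv <- tapply(1 / df.new$Value, df.new$Group, var)

# Ratio of larger to smaller group variance; values near 1 mean similar spread.
ratio_raw <- max(v_raw) / min(v_raw)
ratio_inv <- max(v_inv) / min(v_inv)

# The same F test for equal variances as before, now applied to 1/Value.
p_var_inv <- var.test(1 / Value ~ Group, data = df.new)$p.value

print(c(ratio_raw = ratio_raw, ratio_inv = ratio_inv, p_var_inv = p_var_inv))
```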

Do also be aware of the dangers of multiple testing: it is invalid to 
choose the one you like out of several tests applied to a set of data. 
The bottom line is to collect more data.
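
The danger of picking the most favourable test can be illustrated with a 
small simulation. This is a sketch under assumptions that are not in the 
original exchange: both groups are drawn from the same lognormal 
distribution (so there is no true difference), the group sizes match the 
posted data (24 and 7), and the four tests and the seed are arbitrary 
choices.

```r
set.seed(1)      # arbitrary seed, only for reproducibility
n_sim <- 1000    # number of simulated data sets
n_a <- 24        # size of group A in the posted data
n_b <- 7         # size of group B in the posted data

# One simulated data set with NO true difference between groups,
# analysed with four different two-sample tests.
sim_once <- function() {
  a <- rlnorm(n_a)
  b <- rlnorm(n_b)
  c(pooled = t.test(a, b, var.equal = TRUE)$p.value,
    welch  = t.test(a, b)$p.value,
    wilcox = wilcox.test(a, b)$p.value,
    ks     = ks.test(a, b)$p.value)
}

p <- replicate(n_sim, sim_once())   # 4 x n_sim matrix of p-values

# False-positive rate of each test on its own at the 0.05 level ...
rate_each <- rowMeans(p < 0.05)
# ... and of the strategy "report whichever test gives the smallest p-value".
rate_best <- mean(apply(p, 2, min) < 0.05)

print(round(c(rate_each, best_of_four = rate_best), 3))
```

Because the smallest of the four p-values is never larger than any single 
one, the "best of four" rate can only be at least as high as that of each 
individual test.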

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595