[R] Bug in t.test?

Johannes W. Dietrich j.w.dietrich at medizinische-kybernetik.de
Sat Aug 14 00:44:15 CEST 2010


Thank you for the fast reply! Although I have read the help page for 
t.test over and over again, I obviously overlooked the relevant 
sentence. The "workaround" that I had planned seems to be the 
correct use.
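
For the archive, here is a minimal sketch of that workaround, using the 
test data from my original message quoted below. The object names 
(res_paired, res_shifted, a_shifted, b_shifted) are only illustrative, 
and rep() merely replaces the hand-typed criterion vector. The second 
part is my assumption about the mechanism, not something taken from the 
t.test source: na.exclude seems to drop the missing values from the long 
vector one by one, so both groups shrink to nine values and are then 
paired by position, which no longer matches individuals.

```r
# Test data from the original message (same individuals, some values missing)
testdata.A <- c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01)
testdata.B <- c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA)

# Long layout, as in the proteomics project
testdata  <- c(testdata.A, testdata.B)
criterion <- rep(c(0, 1), each = 10)

# Workaround: split by group and pass two vectors, so that position
# defines the pairs; incomplete pairs are then dropped as a whole.
res_paired <- t.test(testdata[criterion == 0], testdata[criterion == 1],
                     paired = TRUE, alternative = "two.sided")

# Illustration of the assumed mechanism only: removing NAs separately
# from each group shifts the positions, so value i of A is no longer
# paired with value i of B.
a_shifted <- testdata.A[!is.na(testdata.A)]
b_shifted <- testdata.B[!is.na(testdata.B)]
res_shifted <- t.test(a_shifted, b_shifted, paired = TRUE,
                      alternative = "two.sided")

print(res_paired$p.value)   # compare with the 0.1162 reported below
print(res_shifted$p.value)  # compare with the 0.01453 reported below
```

If this reading is right, the formula call with na.action="na.pass" only 
agreed with the two-vector calls because the NAs stayed in place until 
the pairs were formed.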

Thanks again,

J. W. D.

At 15:31 -0700 on 13.08.2010, Thomas Lumley wrote:
>Thanks for the clear example. However, if there is a bug it is only 
>that t.test.formula doesn't throw an error when given the 
>paired=TRUE option.
>
>The documentation says "The formula interface is only applicable for 
>the 2-sample tests.",  but there probably should be an explicit 
>check -- I didn't see that in the documentation until I realized 
>that t.test.formula couldn't work for paired tests because you don't 
>tell it which observations are paired.
>
>    -thomas
>
>
>On Fri, 13 Aug 2010, Johannes W. Dietrich wrote:
>
>>Hello all,
>>
>>Because of unexplained differences between our lab's statistical 
>>results and those of our collaborators on the same large proteomics 
>>dataset, we re-examined the t.test() function. We found odd 
>>behaviour that is also reproducible in the following small 
>>test dataset:
>>
>>Suppose we have two numeric vectors, each with some missing values, 
>>that refer to the same individuals and should therefore be 
>>evaluated with a paired t-test:
>>
>>>  testdata.A <- c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01);
>>>  testdata.B <- c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA);
>>
>>Then
>>
>>>  print(t.test(testdata.A, testdata.B, paired=TRUE, 
>>>alternative="two.sided", na.action="na.pass"))
>>
>>and
>>
>>>  print(t.test(testdata.A, testdata.B, paired=TRUE, 
>>>alternative="two.sided", na.action="na.exclude"))
>>
>>deliver the same p-value (0.1162, identical to Excel's result).
>>
>>However, after combining the two vectors with
>>
>>>  testdata <- c(testdata.A, testdata.B);
>>
>>and defining a criterion vector with
>>
>>>  criterion <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1);
>>
>>(that is the type of data layout we have in our proteomics project) 
>>we get a different p-value (0.01453) with
>>
>>>  print(t.test(testdata ~ criterion, paired=TRUE, 
>>>alternative="two.sided", na.action="na.exclude")) .
>>
>>The statement
>>
>>>  print(t.test(testdata ~ criterion, paired=TRUE, 
>>>alternative="two.sided", na.action="na.pass"))
>>
>>however, delivers a p-value of 0.1162 again.
>>
>>With
>>
>>>  print(t.test(testdata[criterion==0], testdata[criterion==1], 
>>>paired=TRUE, alternative="two.sided", na.action="na.exclude"))
>>
>>which mimics the first form, we again get a p-value of 0.1162.
>>
>>What is the reason for the different p-values? Should not all calls 
>>to t.test that exclude missing values be equivalent and therefore 
>>deliver the same results?
>>
>>Excel, StatView and KaleidaGraph all display a p-value of 0.1162.
>>
>>J. W. D.
>>--
>>-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
>>-- Dr. Johannes W. Dietrich, M.D.
>>-- Laboratory XU44, Endocrine Research
>>-- Medical Hospital I, Bergmannsheil University Hospitals
>>-- Ruhr University of Bochum
>>-- Buerkle-de-la-Camp-Platz 1, D-44789 Bochum, NRW, Germany
>>-- Phone: +49:234:302-6400, Fax: +49:234:302-6403
>>-- eMail: "j.w.dietrich at medical-cybernetics.de"
>>-- WWW: http://medical-cybernetics.de
>>-- WWW: http://www.bergmannsheil.de
>>-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
>>
>
>Thomas Lumley
>Professor of Biostatistics
>University of Washington, Seattle
