[R] dixon test
Fernando Marmolejo-Ramos
fernando.marmolejoramos at adelaide.edu.au
Wed Aug 13 00:15:02 CEST 2008
Hi Giov,
About the Dixon test: I just ran a simple test with a sample of 40 and I
got:
Error in dixon.test(x) : Sample size must be in range 3-30
So it seems that most of the tests in the "outliers" package are designed for
small samples. See also the R News article "Processing data for outliers" by
Lukasz Komsta (the developer of the package), published in May 2006
(vol. 6/2).
However, that package also has a function called "scores" which works for
big samples. It can return the z scores or the corresponding p-values for the
observations you have, so you can determine which values are considered
outliers.
Try this simple example:
library(outliers)
library(gamlss.dist)
# this draws a sample from an exponential+Gaussian distribution
# (which usually has heaps of outliers!)
x <- rexGAUS(100, 2000, 3000, 5000)
# this fails for n = 100, confirming that Dixon only works for samples
# between 3 and 30; try() lets the rest of the script keep running
try(dixon.test(x))
# just to see what the data set looks like and visually confirm the outliers
boxplot(x, notch=TRUE)
# sort the observations in ascending order
sort(x)
# returns the probability (from the z scores) for each observation,
# in ascending order
sort(scores(x, type="z", prob=1))
# flags which observations exceed the 0.95 probability (TRUE = flagged)
sort(scores(x, prob=0.95))
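If you want to see roughly what the z-score approach does without the package, here is a minimal base-R sketch. It is not the source code of scores(); it assumes the probability is simply pnorm() of the z score and that the flag is the one-sided comparison against qnorm(0.95). The data are simulated as a normal plus an exponential, which is what an ex-Gaussian variable is, so gamlss.dist is not needed here.

```r
# Base-R sketch only; NOT the source code of outliers::scores().
set.seed(1)  # arbitrary seed, just for reproducibility

# An ex-Gaussian variable is a normal plus an independent exponential,
# so this mimics rexGAUS(100, 2000, 3000, 5000) without gamlss.dist.
x <- rnorm(100, mean = 2000, sd = 3000) + rexp(100, rate = 1 / 5000)

z <- (x - mean(x)) / sd(x)   # z score of each observation
p <- pnorm(z)                # normal probability of each z score (assumed)
flag <- z > qnorm(0.95)      # assumed rule: TRUE above the 95% normal quantile

# observations flagged as high outliers, largest first
sort(x[flag], decreasing = TRUE)
```

Note that this rule only looks at the upper tail; very low values would need the same check on -z.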
The package help says the following about the "prob" argument:
prob ---- If set, the corresponding p-values instead of scores are given. If
value is set to 1, p-value are returned. Otherwise, a logical vector is
formed, indicating which values are exceeding specified probability. In "z"
and "mad" types, there is also possibility to set this value to zero, and
then scores are confirmed to (n-1)/sqrt(n) value, according to Shiffler
(1998). The "iqr" type does not support probabilities, but "lim" value can
be specified.
The Shiffler reference is not the one that appears in the help. It is
this one:
Shiffler, R. E. (1988). Maximum Z scores and outliers. The American
Statistician, 42(1), 79-80.
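The (n-1)/sqrt(n) value mentioned in the help is the largest z score that any single observation can reach in a sample of size n (using the usual sd with n-1 in the denominator). A quick base-R check, as a sketch: the function name shiffler_max is made up here and is not part of the "outliers" package.

```r
# Base-R check of the (n-1)/sqrt(n) bound.
# shiffler_max is a hypothetical helper name, not a package function.
shiffler_max <- function(n) (n - 1) / sqrt(n)

n <- 10
# most extreme case: n-1 identical values and one value far away
x <- c(rep(0, n - 1), 1000)
z <- (x - mean(x)) / sd(x)

max(z)            # largest z score in this sample
shiffler_max(n)   # theoretical ceiling for n = 10
all.equal(max(z), shiffler_max(n))
```

For n = 10 the ceiling is about 2.85, so a rule like "z > 3" can never flag anything in a sample that small.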
I hope this helps,
Fernando