[R] Kolmogorov-Smirnov statistic

Gad Abraham gabraham at csse.unimelb.edu.au
Sat Aug 29 16:02:50 CEST 2009


This is more of a statistical question: I'm trying to understand the 
formulation of the one-sample two-sided Kolmogorov-Smirnov statistic in 
stats::ks.test() when testing against a uniform distribution.

Basically, it boils down to:

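x <- rnorm(100)

n <- length(x)
# z[i] = F0(x_(i)) - (i - 1) / n, with F0 = punif and x_(i) the sorted sample
z <- punif(sort(x)) - (0:(n - 1)) / n
# 1 / n - z[i] = i / n - F0(x_(i)); take the largest of all 2n terms
max(z, 1 / n - z)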

which is equivalent to the textbook definition

n <- length(x)
z <- punif(sort(x))
# D+ = max over i of i / n - F0(x_(i))
Dplus <- max(sapply(1:n, function(i) i / n - z[i]))
# D- = max over i of F0(x_(i)) - (i - 1) / n
Dminus <- max(sapply(1:n, function(i) z[i] - (i - 1) / n))
max(Dplus, Dminus)

(See, e.g., 
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm, and 
Durbin (1971) ``Distribution theory for tests based on the sample 
distribution function'', p. 6)
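
As a sanity check, the two formulations above can be compared with each 
other and with what ks.test() itself reports. This is only a sketch: the 
seed is arbitrary, and the names d_compact, d_textbook and d_kstest are 
made up for the comparison.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(100)
n <- length(x)

# Compact form, as in the first snippet
z <- punif(sort(x)) - (0:(n - 1)) / n
d_compact <- max(z, 1 / n - z)

# Textbook form, as in the second snippet (vectorised instead of sapply)
u <- punif(sort(x))
d_textbook <- max(max((1:n) / n - u), max(u - (0:(n - 1)) / n))

# Statistic reported by ks.test() for the same null distribution
d_kstest <- unname(ks.test(x, "punif")$statistic)

# Compare up to floating-point tolerance rather than with ==
all.equal(d_compact, d_textbook)
all.equal(d_compact, d_kstest)
```

all.equal() is used instead of == because the two forms do the same 
arithmetic in a different order and may differ in the last few bits.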

Why does the definition of Dminus have i - 1 in the numerator instead 
of i? I have a hunch it has to do with the right-continuity of the ecdf, 
but perhaps someone can shed some light on it.
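
One way to probe the right-continuity hunch numerically (a sketch, not an 
authoritative answer): the ecdf takes the value i/n at the i-th order 
statistic but (i - 1)/n immediately to its left, so F0 - Fn is largest just 
before each jump. The example draws x with runif() rather than rnorm(), an 
assumption made here so that punif() is not flat over most of the sample, 
and eps is a hypothetical small offset standing in for "just to the left".

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- sort(runif(20))             # assumption: uniform sample, see note above
n <- length(x)
Fn <- ecdf(x)                    # right-continuous step function
eps <- 1e-9                      # hypothetical offset, smaller than any gap in x

at_jump      <- Fn(x)            # ecdf at each order statistic
left_of_jump <- Fn(x - eps)      # ecdf just before each order statistic

u <- punif(x)
dminus       <- max(u - (0:(n - 1)) / n)   # textbook D-, uses (i - 1) / n
dminus_naive <- max(u - (1:n) / n)         # same formula with i / n instead

# F0 - Fn evaluated just to the left of every jump
dminus_left <- max(punif(x - eps) - Fn(x - eps))

all.equal(at_jump, (1:n) / n)
all.equal(left_of_jump, (0:(n - 1)) / n)
c(dminus = dminus, dminus_left = dminus_left, dminus_naive = dminus_naive)
```

Under these assumptions dminus_left agrees with dminus up to eps, while 
dminus_naive falls short by 1/n, since it compares F0 with the ecdf value 
after the jump rather than before it.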


Gad Abraham
MEng Student, Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham
