[R] Running median

Martin Maechler maechler at stat.math.ethz.ch
Tue Aug 20 14:38:38 CEST 2002

>>>>> "DavidB" == David Brahm <brahm at alum.mit.edu>
>>>>>     on Mon, 19 Aug 2002 20:18:47 -0400 writes:

    DavidB> I have a Date x Stock (223 x 520) matrix of "trading
    DavidB> volume".  I can calculate a 5-day (past) average in
    DavidB> about 1 second using:

    R> apply(vol, 1, filter, filter=c(0, rep(1/5,5)), sides=1)

    DavidB> I would like to do the same with a 5-day median,
    DavidB> e.g.:

   R> mymed <- function(x, n=5) {
   R>   r <- rep(NA, length(x))
   R>   for (i in (n+1):length(x)) r[i] <- median(x[i-(1:n)])
   R>   return(r)
   R> }

   R> apply(vol, 1, mymed)

    DavidB> only faster (the above takes 65 seconds).  Is there
    DavidB> already a function (or some C code) to do this?  Any
    DavidB> clever way to vectorize it?

    DavidB> smooth() in package "eda" with kind="3" calculates a
    DavidB> running median of 3 values, so I may start with the
    DavidB> code in library/eda/src/smooth.c, but it doesn't
    DavidB> generalize easily to N values.  Also, decmedian() in
    DavidB> package "pastecs" may be relevant, but doesn't seem
    DavidB> any faster than my naive code.


On the "R Developer page", there are TODO lists of several R core members.  
Mine (http://developer.R-project.org/TODO-MM.html) contains an item

  MM> Running Medians for library(modreg) : 
  MM>   We need a fast robust smoother; lowess() is not robust.  
  MM> 	(a non-release of a package "runmed" is available; 
  MM> 	 need to write a short paper on what I found ....)

There two big steps making it much faster :
1) move to compiled code
2) use a smarter algorithm -- I have two versions :
   a. There is a first idea of making it fast which dates back to
      Friedman & Stuetzle's work on projection pursuit in the 80s.
   b. Computing `running median of k' for larger `k' :
      To make this really fast (asymptotically optimal), 
      Haerdle & Steiner published a paper on how to do this
      and Berwin Turlach has programmed it (or just translated
      their program).

Yes, as I say above, this should eventually become part of the
"modreg" package.

Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch

More information about the R-help mailing list