[R] Who to decide what a generic function should look like?

Thu Feb 20 03:09:03 CET 2003

I am not sure whether what I am asking below should be discussed on
r-help or r-devel, so please feel free to move it over to r-devel.

This is a spin-off from Gordon Smyth's question about generic functions
and Robert Gentleman's reply. I have tried to raise the question before
and I am sure this has been discussed by others, but never on the r-help
list as far as I can see. My concern is that generic functions as defined
today are only semi-generic. From ?Methods the definition of a generic
function is:

     "A generic function is a function that has associated with it a
     collection of other functions (the methods), all of which agree in
     formal arguments with the generic."

For me a generic function should be fully generic in the sense that
there are no requirements of argument agreement (and therefore it
should not be documented as a reply to Smyth's thread). Under the
S3/UseMethod scheme as well as the S4/methods scheme this requirement is
enforced (even though one can get around it by using only the argument
"...") and under S4/methods it is followed even more strictly. I
understand that by enforcing matching arguments (and argument types) the
method dispatching mechanism can work much faster. Are there any other
purposes than efficiency behind the argument matching requirement? Why
not make a generic function truly generic? With truly generic
functions, the method dispatching could equally well be done by the
language interpreter itself, thus making generic functions obsolete.
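
To illustrate the "..." workaround mentioned above, here is a minimal
sketch under the S3/UseMethod scheme. The class names MicroarrayData and
Vector, and the arguments method and scale, are made up for this example
and are not taken from any existing package.

```r
# A generic whose only formals are the dispatch object and '...'
normalize <- function(x, ...) UseMethod("normalize")

# Hypothetical method for a hypothetical class; adds its own argument
normalize.MicroarrayData <- function(x, method = "lowess", ...) {
  paste("microarray:", method)
}

# Another hypothetical method with a different extra argument
normalize.Vector <- function(x, scale = 1, ...) {
  v <- unclass(x)
  v / sqrt(sum(v^2)) * scale   # rescale to Euclidean length 'scale'
}

m <- structure(list(), class = "MicroarrayData")
v <- structure(c(3, 4), class = "Vector")

resM <- normalize(m)             # dispatches on class MicroarrayData
resV <- normalize(v, scale = 2)  # dispatches on class Vector
```

Both methods agree with the generic only in the sense that they accept
"x" and "...", which is about as close to "truly generic" as one gets
today.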

My concern is that forcing methods to match the argument signature of
the generic function will make packages incompatible with each other. I
cannot create a generic function called "normalize" for my microarray
package and expect it to work together with another package defining a
generic function with the same name. Some short-term and long-term
outcomes of this are:

  * Each developer who cares has to come up with a less general name
    than "normalize", e.g. "normalizeLowess". However, this will still
    not guarantee that there will be no naming conflict with other
    packages, whether unknown to you or yet to come. People will have
    to create extremely awkward method names to make their generic
    functions unique. This is already happening today (and
    unfortunately everyone invents his/her own naming rules).

  * This in turn will result in an object-oriented programming style
    that looks like a procedural programming style, and the gain/idea
    of having polymorphism and the possibility of overloading methods
    will disappear. This can also be seen in some packages.

  * If you do not follow the approach of having unique method names,
    but still want to keep your package compatible with others, you
    will have to change your API constantly, which hurts the end users,
    who have to update their scripts accordingly. This will also result
    in unnecessary troubleshooting and bug fixes.

  * Trying to learn object-oriented programming by using R will be
    confusing, resulting in procedural/object-oriented hybrids. This
    can also be seen today.
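
The naming conflict described above can be sketched as follows. This is
only a toy simulation in a single R session: the two "packages" are
imaginary, and so are the class MicroarrayData and the arguments object,
method, x and center.

```r
# --- Imaginary package A: generic with formals (object, method) ---
normalize <- function(object, method = "lowess") UseMethod("normalize")
normalize.MicroarrayData <- function(object, method = "lowess") {
  paste("A:", method)
}

m <- structure(list(), class = "MicroarrayData")
before <- normalize(m, method = "lowess")   # works with A's generic

# --- Imaginary package B, loaded later: same name, other formals ---
normalize <- function(x, center = TRUE) UseMethod("normalize")
normalize.numeric <- function(x, center = TRUE) {
  if (center) x - mean(x) else x
}

# The name "normalize" now refers to B's generic, so a call written
# against A's signature fails before dispatch is even attempted.
after <- tryCatch(normalize(m, method = "lowess"),
                  error = function(e) "error")

centered <- normalize(c(1, 2, 3))           # B's own use still works
```

A script written against the first generic breaks as soon as the second
definition is in place, even though the method for MicroarrayData still
exists.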

Does anyone agree with this, and what are your thoughts about it?

So,

  * who is the person to decide what a generic function should look
    like, and
  * who owns the right to the method name "normalize"?

Best wishes

Henrik Bengtsson
