[R] a < b < c is always TRUE

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Sat Jul 7 12:35:14 CEST 2001

Duncan Murdoch <dmurdoch at pair.com> writes:

> As a general principle, code should be easy to read and understand by
> the target audience.  If your audience is C programmers, then it's
> fine to write junk like "if (1) ...", but if your audience is
> statisticians, it's better if what you write is closer to standard
> mathematical syntax.  Since 0 and 1 are numbers, not logical values,
> it would be better if they were treated as such in R.  Since "3 < 2 <
> 1" has the interpretation "false" in standard mathematical notation,
> if R is to accept it, it would be better if it had the same value in
> R.
> HOWEVER, R is already old, and is essentially S, which is very old.  
> "3 < 2 < 1" has a well-defined meaning in R.  Changing it now would be
> a bad thing.  Adding warnings (or even a lint-like utility to check
> through R source code) for constructions like this which are likely
> sources of bugs would be a good thing, but there are a lot of good
> things to do, and only a finite amount of time for people to do them.
> Adding this wouldn't  be at the top of my priority list.

(S is not *that* old (New S, 1988), but it does build on principles of
parser construction that are a decade or more older.)

The general principle is that rules should be clear and consistent!
You don't get anywhere by assuming that computer scientists don't know
what they are doing and trying to change rules to suit a naive
audience. Hardly any programming language allows the 1 < x < 2
construct. Either they throw an error or they do something unexpected
(but logical) due to type coercion. (Also, I don't think any version
of *formal* mathematical logic would allow that notation; relational
operators are binary.)
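
A minimal sketch of the "unexpected but logical" case. The parentheses
only spell out the left-to-right grouping discussed in this thread, and
the value given to x is an arbitrary illustration:

```r
# A chained comparison groups from the left: "3 < 2 < 1" means (3 < 2) < 1.
step1   <- 3 < 2          # FALSE
chained <- (3 < 2) < 1    # FALSE is coerced to 0, and 0 < 1 is TRUE

# The same thing happens with an interval test.
x <- 5
naive       <- (1 < x) < 2     # TRUE is coerced to 1, and 1 < 2 is TRUE
in_interval <- 1 < x & x < 2   # two binary comparisons joined by &: FALSE
```

The result follows from the rules as they stand: a logical operand is
coerced to 0 or 1 before it is compared with a number. The interval
test has to be written as two comparisons.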

Whenever you try to make a language do something "intuitive" you also
find gotchas lurking inside, and of a different sort than what you
could figure out by considering the rules as they are, instead of what
you expect them to be.

Consider the following:

x <- rep(NA,10)
x < 0

This works fine, and the result of the comparison is a vector of
NAs. However, NA is a logical constant, so if you tried to prevent
comparisons of logicals with numerics, it would bomb. These all-NA
vectors can creep in all over the place, so you'd get to pepper large
portions of your code with checks like

if (is.logical(x) && all(is.na(x))) rep(NA,length(x)) else x < 0
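
To see what such a restriction would cost, here is a rough sketch.
strict_less() and guarded_less_zero() are hypothetical helpers made up
for this illustration; nothing like them exists in R:

```r
x <- rep(NA, 10)
x_is_logical <- is.logical(x)   # TRUE: a bare NA is a logical constant
res <- x < 0                    # currently allowed; a logical vector of 10 NAs

# Hypothetical stricter rule: refuse to compare logicals with numbers.
strict_less <- function(a, b) {
  if (is.logical(a) || is.logical(b))
    stop("comparison of logical with numeric")
  a < b
}

# Under that rule the all-NA vector fails; the error is captured here.
outcome <- tryCatch(strict_less(x, 0), error = function(e) "error")

# So every caller would need the guard shown above, wrapped here as a helper.
guarded_less_zero <- function(x) {
  if (is.logical(x) && all(is.na(x))) rep(NA, length(x)) else strict_less(x, 0)
}
ok_na  <- guarded_less_zero(x)          # 10 NAs again
ok_num <- guarded_less_zero(c(-1, 2))   # TRUE FALSE
```

The guard only restores the behaviour that the present coercion rules
already give for free.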

(There have been countless such issues coming up over the years,
and most often, once the full implications of a change of semantics
have been realized, they have ended with a "perhaps Uncle John was
right after all"...)

   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
