[Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
Gabriel Becker
gmbecker at ucdavis.edu
Fri Sep 9 21:15:08 CEST 2016
Martin et al.,
I seem to be in the minority here, so I won't belabor the point too much,
but one last response inline:
On Thu, Sep 8, 2016 at 11:51 PM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
> Thank you, Gabe and Bill,
>
> for taking up the discussion.
>
> >>>>> William Dunlap <wdunlap at tibco.com>
> >>>>> on Thu, 8 Sep 2016 10:45:07 -0700 writes:
>
> > Prior to the mid-1990s, S did "length-0 OP length-n -> rep(NA, n)"
> > and it was changed to "length-0 OP length-n -> length-0" to avoid
> > lots of problems like any(x<0) being NA when length(x)==0. Yes,
> > people could code defensively by putting lots of
> > if(length(x)==0)... in their code, but that is tedious and
> > error-prone and creates really ugly code.
>
> Yes, so actually, basically
>
> length-0 OP <anything> -> length-0
>
> Now the case of NULL that Bill mentioned.
> I agree that NULL is not at all the same thing as double(0) or
> logical(0), *but* there have been quite a few cases, where NULL is the
> result of operations where "for consistency" double(0) / logical(0)
> should have been.... and there are the users who use NULL as the
> equivalent of those, e.g., by initializing a (to be grown, yes, very
> inefficient!) vector with NULL instead of with say double(0).
>
> For these reasons, many operations that expect a "number-like"
> (includes logical) atomic vector have treated NULL as such...
> *and* parts of the {arith/logic/relop} OPs have done so already
> in R "forever".
> I still would argue that for these OPs, treating NULL as logical(0)
> {which then may be promoted by the usual rules} is a good thing.
>
>
> > Is your suggestion to leave the length-0 OP length-1 case as it is
> > but make length-0 OP length-two-or-higher an error or warning (akin
> > to the length-2 OP length-3 case)?
>
> That's exactly one thing the current changes eliminated:
> arithmetic (only; not logic, or relop) did treat the length-1
> case (for arrays!) differently from the length-GE-2 case. And I strongly
> believe that this is very wrong and counter to the predominant
> recycling rules in (S and) R.
>
In my view, the recycling rules apply first and foremost to pairs of
vectors of lengths n, m >= 1. And they can be semantically explained in that
case very easily: "the shorter, non-zero-length vector is rep'ed out to be
the length of the longer vector and then (generally) an element-wise
operation takes place". The zero-length behavior already does not adhere to
this definition, as no amount of rep'ing can bring a zero-length vector up
to the length of a nonzero-length one.
So the zero-length recycling behavior is already special-cased as I
understand it. In light of that, it seems that it would be allowable to
have different behavior based on the length of the other vector.
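To make that distinction concrete, here is a minimal sketch in plain R. The
variable names x, y and z are only illustrative. It checks that recycling
for non-zero lengths matches the "rep'ed out" description, and that the
zero-length case instead yields a zero-length result:

```r
x <- 1:6
y <- c(10, 20)

# Non-zero lengths: the shorter vector behaves as if rep'ed out to the
# length of the longer one before the element-wise operation.
same <- identical(x + y, x + rep_len(y, length(x)))
same                      # TRUE

# Zero-length operand: the result has length 0, whatever the length of
# the other operand, so this is not "rep out the shorter vector".
z <- numeric()
len_with_six <- length(z + x)   # 0
len_with_one <- length(z + 1)   # 0
```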
Furthermore, while I acknowledge the usefulness of the
x = numeric()
x < 5
case (i.e., the other vector is length 1), I can't come up with any use of,
e.g.,
y = numeric()
y < 3:5
that I can make any sense of other than as a violation of implicit
assumptions by the coder about the length of y.
Thus, I still think that case should at *least* warn, preferably (imho) give an
error.
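As a rough sketch of the behavior I am arguing for (strict_lt is a
hypothetical helper written only for this message, not something in R or
proposed for it under that name), one could wrap `<` so that a zero-length
operand paired with an operand of length 2 or more is an error, while the
length-1 case is left alone:

```r
# Hypothetical helper, for illustration only: '<' that rejects a
# zero-length operand paired with an operand of length >= 2.
strict_lt <- function(e1, e2) {
  n <- c(length(e1), length(e2))
  if (min(n) == 0L && max(n) >= 2L)
    stop("zero-length operand compared with length-", max(n), " operand")
  e1 < e2
}

x <- numeric()
ok <- strict_lt(x, 5)     # logical(0), same as x < 5

y <- numeric()
silent <- y < 3:5         # logical(0): base R returns this silently

# The strict variant signals an error instead; capture its message.
msg <- tryCatch(strict_lt(y, 3:5), error = function(e) conditionMessage(e))
```

Swapping stop() for warning() in the helper would give the milder variant.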
Best,
~G
--
Gabriel Becker, PhD
Associate Scientist (Bioinformatics)
Genentech Research