[Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
William Dunlap
wdunlap at tibco.com
Thu Sep 8 19:05:33 CEST 2016
Shouldn't binary operators (arithmetic and logical) throw an error
when one operand is NULL (or other type that doesn't make sense)? This is
a different case than a zero-length operand of a legitimate type. E.g.,
any(x < 0)
should return FALSE if x is number-like and length(x)==0 but give an error
if x is NULL.
I.e., I think the type check should be done before the length check.
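A minimal sketch of that ordering, assuming a hypothetical helper any_negative() (not part of base R) that rejects an operand of the wrong type before the length is ever looked at:

```r
# Hypothetical helper (not base R): check the type first, then the length.
any_negative <- function(x) {
  # is.numeric(NULL) is FALSE, so NULL is rejected here
  if (!is.numeric(x)) {
    stop("'x' must be numeric, not ", class(x)[1L])
  }
  # any(logical(0)) is FALSE, so a zero-length numeric passes through cleanly
  any(x < 0)
}

any_negative(numeric(0))                      # zero-length, legitimate type
any_negative(c(1, -2))                        # ordinary case
r <- try(any_negative(NULL), silent = TRUE)   # wrong type: an error
```

The is.numeric() test is only one possible reading of "number-like"; a stricter or looser predicate could be swapped in without changing the order of the checks.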
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker <gmbecker at ucdavis.edu> wrote:
> Martin,
>
> Like Robin and Oliver I think this type of edge-case consistency is
> important and that it's fantastic that R-core - and you personally - are
> willing to tackle some of these "gotcha" behaviors. "Little" stuff like
> this really does combine to go a long way to making R better and better.
>
> I do wonder a bit about the
>
> x = 1:2
>
> y = NULL
>
> x < y
>
> case.
>
> Returning a logical of length 0 is more backwards compatible, but is it
> ever what the author actually intended? I have trouble thinking of a case
> where that less-than didn't carry an implicit assumption that y was
> non-NULL. I can say that in my own code, I've never hit that behavior in a
> case that wasn't an error.
>
> My vote (unless someone else points out a compelling use for the behavior)
> is for that to throw an error. As a developer, I'd rather things like this
> break so the bug in my logic is visible, rather than propagating as the
> 0-length logical is &'ed or |'ed with other logical vectors, or used to
> subset, or (in the case it should be length 1) passed to if() (if throws an
> error now, but the rest would silently "work").
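The silent propagation described above can be reproduced with plain (non-array) vectors. This is only an illustrative sketch; the variable names are made up, and y stands in for a value that was never set:

```r
x <- 1:2
y <- NULL                       # stands in for a value that was never set

cmp    <- x < y                 # plain vector vs NULL: a zero-length logical
keep   <- cmp & c(TRUE, FALSE)  # still zero-length, and no warning
picked <- x[cmp]                # subsetting silently selects nothing

# Only if() refuses a zero-length condition
res <- try(if (cmp) "yes" else "no", silent = TRUE)
```

Nothing before the if() call signals that y was NULL, which is the visibility problem raised in the paragraph above.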
>
> Best,
> ~G
>
> On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>
> > >>>>> robin hankin <hankin.robin at gmail.com>
> > >>>>> on Thu, 8 Sep 2016 10:05:21 +1200 writes:
> >
> > > Martin I'd like to make a comment; I think that R's
> > > behaviour on 'edge' cases like this is an important thing
> > > and it's great that you are working on it.
> >
> > > I make heavy use of zero-extent arrays, chiefly because
> > > the dimnames are an efficient and logical way to keep
> > > track of certain types of information.
> >
> > > If I have, for example,
> >
> > > a <- array(0,c(2,0,2))
> > > dimnames(a) <- list(name=c('Mike','Kevin'), NULL, item=c("hat","scarf"))
> >
> >
> > > Then in R-3.3.1, 70800 I get
> >
> > a> 0
> > > logical(0)
> > >>
> >
> > > But in 71219 I get
> >
> > a> 0
> > > , , item = hat
> >
> >
> > > name
> > > Mike
> > > Kevin
> >
> > > , , item = scarf
> >
> >
> > > name
> > > Mike
> > > Kevin
> >
> > > (which is an empty logical array that holds the names of the people and
> > > their clothes). I find the behaviour of 71219 very much preferable because
> > > there is no reason to discard the information in the dimnames.
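A runnable restatement of the example above, as a sketch that assumes a version of R with the new behaviour (the thread places it in R >= 3.4.0). The name r is introduced here only for illustration:

```r
a <- array(0, c(2, 0, 2))
dimnames(a) <- list(name = c("Mike", "Kevin"), NULL, item = c("hat", "scarf"))

r <- a > 0
length(r)     # 0: there are no elements to compare
dim(r)        # under the new behaviour the 2 x 0 x 2 shape is kept
dimnames(r)   # ... and so are the names
```

Under the older behaviour quoted above, r would instead be a bare logical(0) with no dim or dimnames.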
> >
> > Thanks a lot, Robin, (and Oliver) !
> >
> > Yes, the above is such a case where the new behavior makes much sense.
> > And this behavior remains identical after the 71222 amendment.
> >
> > Martin
> >
> > > Best wishes
> > > Robin
> >
> >
> >
> >
> > > On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
> >
> > >> >>>>> Martin Maechler <maechler at stat.math.ethz.ch>
> > >> >>>>> on Tue, 6 Sep 2016 22:26:31 +0200 writes:
> > >>
> > >> > Yesterday, changes to R's development version were committed, relating
> > >> > to arithmetic, logic ('&' and '|') and
> > >> > comparison/relational ('<', '==') binary operators
> > >> > which in NEWS are described as
> > >>
> > >> > SIGNIFICANT USER-VISIBLE CHANGES:
> > >>
> > >> > [.............]
> > >>
> > >> > • Arithmetic, logic (‘&’, ‘|’) and comparison (aka
> > >> > ‘relational’, e.g., ‘<’, ‘==’) operations with arrays now
> > >> > behave consistently, notably for arrays of length zero.
> > >>
> > >> > Arithmetic between length-1 arrays and longer non-arrays had
> > >> > silently dropped the array attributes and recycled. This
> > >> > now gives a warning and will signal an error in the future,
> > >> > as it has always for logic and comparison operations in
> > >> > these cases (e.g., compare ‘matrix(1,1) + 2:3’ and
> > >> > ‘matrix(1,1) < 2:3’).
> > >>
> > >> > As the above "visually suggests" one could think of the changes
> > >> > falling mainly into two groups,
> > >> > 1) <0-extent array> (op) <non-array>
> > >> > 2) <1-extent array> (arith) <non-array of length != 1>
> > >>
> > >> > These changes are partly not backward compatible and may break
> > >> > existing code. We believe that the internal consistency gained
> > >> > from the changes is worth the few places with problems.
> > >>
> > >> > We expect that some package maintainers (10-20, or even more?) will
> > >> > need to adapt their code.
> > >>
> > >> > Case '2)' above mainly results in a new warning, e.g.,
> > >>
> > >> >> matrix(1,1) + 1:2
> > >> > [1] 2 3
> > >> > Warning message:
> > >> > In matrix(1, 1) + 1:2 :
> > >> > dropping dim() of array of length one. Will become ERROR
> > >> >>
> > >>
> > >> > whereas '1)' gives errors in cases where the result silently was a
> > >> > vector of length zero, or keeps the array attributes (dim & dimnames)
> > >> > in cases where these were silently dropped.
> > >>
> > >> > The following is a "heavily" commented R script showing (all ?)
> > >> > the important cases with changes :
> > >>
> > >> > ----------------------------------------------------------------------------
> > >>
> > >> > (m <- cbind(a=1[0], b=2[0]))
> > >> > Lm <- m; storage.mode(Lm) <- "logical"
> > >> > Im <- m; storage.mode(Im) <- "integer"
> > >>
> > >> > ## 1. -------------------------
> > >> > try( m & NULL ) # in R <= 3.3.x :
> > >> > ## Error in m & NULL :
> > >> > ## operations are possible only for numeric, logical or complex types
> > >> > ##
> > >> > ## gives 'Lm' in R >= 3.4.0
> > >>
> > >> > ## 2. -------------------------
> > >> > m + 2:3 ## gave numeric(0), now remains matrix identical to m
> > >> > Im + 2:3 ## gave integer(0), now remains matrix identical to Im (integer)
> > >>
> > >> > m > 1 ## gave logical(0), now remains matrix identical to Lm (logical)
> > >> > m > 0.1[0] ## ditto
> > >> > m > NULL ## ditto
> > >>
> > >> > ## 3. -------------------------
> > >> > mm <- m[,c(1:2,2:1,2)]
> > >> > try( m == mm ) ## now gives error "non-conformable arrays",
> > >> > ## but gave logical(0) in R <= 3.3.x
> > >>
> > >> > ## 4. -------------------------
> > >> > str( Im + NULL) ## gave "num", now gives "int"
> > >>
> > >> > ## 5. -------------------------
> > >> > ## special case for arithmetic w/ length-1 array
> > >> > (m1 <- matrix(1,1,1, dimnames=list("Ro","col")))
> > >> > (m2 <- matrix(1,2,1, dimnames=list(c("A","B"),"col")))
> > >>
> > >> > m1 + 1:2 # -> 2:3 but now with warning to "become ERROR"
> > >> > tools::assertError(m1 & 1:2)# ERR: dims [product 1] do not match the length of object [2]
> > >> > tools::assertError(m1 < 1:2)# ERR: (ditto)
> > >> > ##
> > >> > ## non-0-length arrays combined with {NULL or double() or ...} *fail*
> > >>
> > >> > ### Length-1 arrays: Arithmetic with |vectors| > 1 treated array as scalar
> > >> > m1 + NULL # gave numeric(0) in R <= 3.3.x --- still, *but* w/ warning to "be ERROR"
> > >> > try(m1 > NULL) # gave logical(0) in R <= 3.3.x --- an *error* now in R >= 3.4.0
> > >> > tools::assertError(m1 & NULL) # gave and gives error
> > >> > tools::assertError(m1 | double())# ditto
> > >> > ## m2 was slightly different:
> > >> > tools::assertError(m2 + NULL)
> > >> > tools::assertError(m2 & NULL)
> > >> > try(m2 == NULL) ## was logical(0) in R <= 3.3.x; now error as above!
> > >>
> > >> > ----------------------------------------------------------------------------
> > >>
> > >>
> > >> > Note that in R's own 'nls' sources, there was one case of
> > >> > situation '2)' above, i.e. a 1x1-matrix was used as a "scalar".
> > >>
> > >> > In such cases, you should explicitly coerce it to a vector,
> > >> > either ("self-explainingly") by as.vector(.), or as I did in
> > >> > the nls case by c(.) : The latter is much less
> > >> > self-explaining, but nicer to read in mathematical formulae, and
> > >> > currently also more efficient because it is a .Primitive.
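The explicit coercion recommended above might look like the following sketch. The names m1 and v are illustrative only, and this is not the actual nls code:

```r
m1 <- matrix(1, 1, 1, dimnames = list("Ro", "col"))  # a 1x1 matrix used as a "scalar"
v  <- 1:2

s1 <- as.vector(m1) + v   # self-explaining
s2 <- c(m1) + v           # terser, reads better inside a formula

dim(c(m1))                # NULL: the array attributes are gone before recycling
```

Both forms drop dim and dimnames first, so the addition is ordinary vector recycling and does not depend on how length-1 arrays are treated.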
> > >>
> > >> > Please use R-devel with your code, and let us know if you see
> > >> > effects that seem adverse.
> > >>
> > >> I've been slightly surprised (or even "frustrated") by the empty
> > >> reaction on our R-devel list to this post.
> > >>
> > >> I would have expected some critique, maybe even some praise,
> > >> ... in any case some sign people are "thinking along" (as we say
> > >> in German).
> > >>
> > >> In the meantime, I've actually thought along the one case which
> > >> is last above: The <op> (binary operation) between a
> > >> non-0-length array and a 0-length vector (and NULL which should
> > >> be treated like a 0-length vector):
> > >>
> > >> R <= 3.3.1 *is* quite inconsistent with these:
> > >>
> > >>
> > >> and my proposal above (implemented in R-devel, since Sep.5) would give an
> > >> error for all these, but instead, R really could be more lenient here:
> > >> A 0-length result is ok, and it should *not* inherit the array
> > >> (dim, dimnames), since the array is not of length 0. So instead
> > >> of the above [for the very last part only!!], we would aim for
> > >> the following. These *all* give an error in current R-devel,
> > >> with the exception of 'm1 + NULL' which "only" gives a "bad
> > >> warning" :
> > >>
> > >> ------------------------
> > >>
> > >> m1 <- matrix(1,1)
> > >> m2 <- matrix(1,2)
> > >>
> > >> m1 + NULL # numeric(0) in R <= 3.3.x ---> OK ?!
> > >> m1 > NULL # logical(0) in R <= 3.3.x ---> OK ?!
> > >> try(m1 & NULL) # ERROR in R <= 3.3.x ---> change to logical(0) ?!
> > >> try(m1 | double())# ERROR in R <= 3.3.x ---> change to logical(0) ?!
> > >> ## m2 slightly different:
> > >> try(m2 + NULL) # ERROR in R <= 3.3.x ---> change to double(0) ?!
> > >> try(m2 & NULL) # ERROR in R <= 3.3.x ---> change to logical(0) ?!
> > >> m2 == NULL # logical(0) in R <= 3.3.x ---> OK ?!
> > >>
> > >> ------------------------
> > >>
> > >> This would be slightly more back-compatible than the currently
> > >> implemented proposal. Everything else I said remains true, and
> > >> I'm pretty sure most changes needed in packages would remain to be done.
> > >>
> > >> Opinions ?
> > >>
> > >>
> > >>
> > >> > In some cases where R-devel now gives an error but did not
> > >> > previously, we could contemplate giving another "warning
> > >> > .... 'to become ERROR'" if there was too much breakage, though
> > >> > I don't expect that.
> > >>
> > >>
> > >> > For the R Core Team,
> > >>
> > >> > Martin Maechler,
> > >> > ETH Zurich
> > >>
> > >> ______________________________________________
> > >> R-devel at r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-devel
> > >>
> >
> >
> >
> > > --
> > > Robin Hankin
> > > Neutral theorist
> > > hankin.robin at gmail.com
> >
> >
>
>
>
> --
> Gabriel Becker, PhD
> Associate Scientist (Bioinformatics)
> Genentech Research
>
>