[Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
Paul Gilbert
pgilbert902 at gmail.com
Fri Sep 9 21:11:30 CEST 2016
On 09/08/2016 05:06 PM, robin hankin wrote:
> Could we take a cue from min() and max()?
>
>> x <- 1:10
>> min(x[x>7])
> [1] 8
>> min(x[x>11])
> [1] Inf
> Warning message:
> In min(x[x > 11]) : no non-missing arguments to min; returning Inf
>>
>
> As ?min says, this is implemented to preserve transitivity, and this
> makes a lot of sense.
> I think the issuing of a warning here is a good compromise; I can
> always turn off warnings if I want.
I fear you are thinking of this as an end user, rather than as a package
developer. Warnings are for end users, when they do something they
possibly should be warned about. A package really should not generate
warnings unless they are for end user consumption. In package
development I treat warnings the same way I treat errors: the build
fails, and I program around it. So what you call a compromise is no
compromise at all as far as I am concerned.
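A minimal sketch of "programming around" the min() warning quoted above.
The helper name safe_min() and its 'empty' argument are hypothetical, not
part of base R; the point is only that package code decides explicitly what
an empty input means, so no warning ever reaches the end user:

```r
# safe_min() is a hypothetical helper, not a base R function.
safe_min <- function(x, empty = Inf) {
  # Handle the zero-length case explicitly so min() never warns.
  if (length(x) == 0L) return(empty)
  min(x)
}

x <- 1:10
safe_min(x[x > 7])    # smallest element above 7
safe_min(x[x > 11])   # empty selection: returns 'empty', no warning

# During package checks, a stray warning can be caught and treated as a failure:
res <- tryCatch(min(x[x > 11]), warning = function(w) "warning caught")
```

Setting options(warn = 2) during development has a similar effect: every
warning is turned into an error.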
But perhaps there is a use for an end user version, maybe All() or ALL()
that issues an error or warning. There are a lot of functions and
operators in R that could warn about mistakes that a user may be making.
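As a rough sketch only, such an end-user All() could look like the
following. The function and its 'on_empty' argument are assumptions made
for illustration; nothing like this exists in base R, and all() itself
stays untouched:

```r
# All() is hypothetical: a stricter, end-user wrapper around all().
All <- function(..., na.rm = FALSE, on_empty = c("error", "warning")) {
  on_empty <- match.arg(on_empty)
  # Total number of elements across all arguments (NULL counts as 0).
  n <- sum(lengths(list(...)))
  if (n == 0L) {
    msg <- "All(): zero-length input; all() would return a vacuous TRUE"
    if (on_empty == "error") stop(msg, call. = FALSE)
    warning(msg, call. = FALSE)
  }
  all(..., na.rm = na.rm)
}

x <- 1:10
all(x[x > 11] < 0)                          # vacuously TRUE, silently
try(All(x[x > 11] < 0))                     # stops with an error
All(x[x > 11] < 0, on_empty = "warning")    # TRUE, with a warning
```

Package code that relies on vacuous truth keeps calling all(); only
interactive users who want the extra check would opt in to All().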
Paul
>
> I find this behaviour of min() and max() to be annoying in the *right*
> way: it annoys me precisely when I need to be
> annoyed, that is, when I haven't thought through the consequences of
> sending zero-length arguments.
>
>
> On Fri, Sep 9, 2016 at 6:00 AM, Paul Gilbert <pgilbert902 at gmail.com> wrote:
>>
>>
>> On 09/08/2016 01:22 PM, Gabriel Becker wrote:
>>>
>>> On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap <wdunlap at tibco.com> wrote:
>>>
>>>> Shouldn't binary operators (arithmetic and logical) throw an error
>>>> when one operand is NULL (or other type that doesn't make sense)? This
>>>> is a different case than a zero-length operand of a legitimate type.
>>>> E.g.,
>>>> any(x < 0)
>>>> should return FALSE if x is number-like and length(x)==0 but give an
>>>> error if x is NULL.
>>>>
>>> Bill,
>>>
>>> That is a good point. I can see the argument for this in the case that the
>>> non-zero length is 1. I'm not sure which is better though. If we switch
>>> any() to all(), things get murky.
>>>
>>> Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and
>>> all(x>0)), but the likelihood of this being a thought-bug on the author's
>>> part is exceedingly high, imho.
>>
>>
>> I suspect there may be more R users than you think that understand and use
>> vacuously true in code. I don't really like the idea of turning a perfectly
>> good and properly documented mathematical test into an error in order to
>> protect against a possible "thought-bug".
>>
>> Paul
>>
>>
>>> So the desirable behavior seems to depend on the angle we look at it from.
>>>
>>> My personal opinion is that x < y with length(x)==0 should fail if
>>> length(y) > 1, at least, and I'd be for it being an error even if y is
>>> length 1, though I do acknowledge this is more likely (though still quite
>>> unlikely imho) to be the intended behavior.
>>>
>>> ~G
>>>
>>>>
>>>> I.e., I think the type check should be done before the length check.
>>>>
>>>>
>>>> Bill Dunlap
>>>> TIBCO Software
>>>> wdunlap tibco.com
>>>>
>>>> On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker <gmbecker at ucdavis.edu>
>>>> wrote:
>>>>
>>>>> Martin,
>>>>>
>>>>> Like Robin and Oliver I think this type of edge-case consistency is
>>>>> important and that it's fantastic that R-core - and you personally - are
>>>>> willing to tackle some of these "gotcha" behaviors. "Little" stuff like
>>>>> this really does combine to go a long way to making R better and better.
>>>>>
>>>>> I do wonder a bit about the
>>>>>
>>>>> x = 1:2
>>>>>
>>>>> y = NULL
>>>>>
>>>>> x < y
>>>>>
>>>>> case.
>>>>>
>>>>> Returning a logical of length 0 is more backwards compatible, but is it
>>>>> ever what the author actually intended? I have trouble thinking of a
>>>>> case where that less-than didn't carry an implicit assumption that y was
>>>>> non-NULL. I can say that in my own code, I've never hit that behavior
>>>>> in a case that wasn't an error.
>>>>>
>>>>> My vote (unless someone else points out a compelling use for the
>>>>> behavior) is for this to throw an error. As a developer, I'd rather
>>>>> things like this break so the bug in my logic is visible, rather than
>>>>> propagating as the 0-length logical is &'ed or |'ed with other logical
>>>>> vectors, or used to subset, or (in the case it should be length 1)
>>>>> passed to if() (if throws an error now, but the rest would silently
>>>>> "work").
>>>>>
>>>>> Best,
>>>>> ~G
>>>>>
>>>>> On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler <
>>>>> maechler at stat.math.ethz.ch>
>>>>> wrote:
>>>>>
>>>>>>>>>>> robin hankin <hankin.robin at gmail.com>
>>>>>>>>>>> on Thu, 8 Sep 2016 10:05:21 +1200 writes:
>>>>>>
>>>>>>
>>>>>> > Martin I'd like to make a comment; I think that R's
>>>>>> > behaviour on 'edge' cases like this is an important thing
>>>>>> > and it's great that you are working on it.
>>>>>>
>>>>>> > I make heavy use of zero-extent arrays, chiefly because
>>>>>> > the dimnames are an efficient and logical way to keep
>>>>>> > track of certain types of information.
>>>>>>
>>>>>> > If I have, for example,
>>>>>>
>>>>>> > a <- array(0,c(2,0,2))
>>>>>> > dimnames(a) <- list(name=c('Mike','Kevin'), NULL, item=c("hat","scarf"))
>>>>>>
>>>>>>
>>>>>> > Then in R-3.3.1, 70800 I get
>>>>>>
>>>>>> > > a > 0
>>>>>> > logical(0)
>>>>>> >>
>>>>>>
>>>>>> > But in 71219 I get
>>>>>>
>>>>>> > > a > 0
>>>>>> > , , item = hat
>>>>>>
>>>>>>
>>>>>> > name
>>>>>> > Mike
>>>>>> > Kevin
>>>>>>
>>>>>> > , , item = scarf
>>>>>>
>>>>>>
>>>>>> > name
>>>>>> > Mike
>>>>>> > Kevin
>>>>>>
>>>>>> > (which is an empty logical array that holds the names of the people and
>>>>>> > their clothes). I find the behaviour of 71219 very much preferable because
>>>>>> > there is no reason to discard the information in the dimnames.
>>>>>>
>>>>>> Thanks a lot, Robin, (and Oliver) !
>>>>>>
>>>>>> Yes, the above is such a case where the new behavior makes much sense.
>>>>>> And this behavior remains identical after the 71222 amendment.
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>> > Best wishes
>>>>>> > Robin
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> > On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler <
>>>>>> maechler at stat.math.ethz.ch>
>>>>>> > wrote:
>>>>>>
>>>>>> >> >>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>> >> >>>>> on Tue, 6 Sep 2016 22:26:31 +0200 writes:
>>>>>> >>
>>>>>> >> > Yesterday, changes to R's development version were committed,
>>>>>> >> > relating to arithmetic, logic ('&' and '|') and
>>>>>> >> > comparison/relational ('<', '==') binary operators
>>>>>> >> > which in NEWS are described as
>>>>>> >>
>>>>>> >> > SIGNIFICANT USER-VISIBLE CHANGES:
>>>>>> >>
>>>>>> >> > [.............]
>>>>>> >>
>>>>>> >> > • Arithmetic, logic (‘&’, ‘|’) and comparison (aka
>>>>>> >> > ‘relational’, e.g., ‘<’, ‘==’) operations with arrays now
>>>>>> >> > behave consistently, notably for arrays of length zero.
>>>>>> >>
>>>>>> >> > Arithmetic between length-1 arrays and longer non-arrays had
>>>>>> >> > silently dropped the array attributes and recycled. This
>>>>>> >> > now gives a warning and will signal an error in the future,
>>>>>> >> > as it has always for logic and comparison operations in
>>>>>> >> > these cases (e.g., compare ‘matrix(1,1) + 2:3’ and
>>>>>> >> > ‘matrix(1,1) < 2:3’).
>>>>>> >>
>>>>>> >> > As the above "visually suggests" one could think of the changes
>>>>>> >> > falling mainly into two groups,
>>>>>> >> > 1) <0-extent array> (op) <non-array>
>>>>>> >> > 2) <1-extent array> (arith) <non-array of length != 1>
>>>>>> >>
>>>>>> >> > These changes are partly not backward compatible and may break
>>>>>> >> > existing code. We believe that the internal consistency gained
>>>>>> >> > from the changes is worth the few places with problems.
>>>>>> >>
>>>>>> >> > We expect some package maintainers (10-20, or even more?) need
>>>>>> >> > to adapt their code.
>>>>>> >>
>>>>>> >> > Case '2)' above mainly results in a new warning, e.g.,
>>>>>> >>
>>>>>> >> >> matrix(1,1) + 1:2
>>>>>> >> > [1] 2 3
>>>>>> >> > Warning message:
>>>>>> >> > In matrix(1, 1) + 1:2 :
>>>>>> >> > dropping dim() of array of length one. Will become ERROR
>>>>>> >> >>
>>>>>> >>
>>>>>> >> > whereas '1)' gives errors in cases the result silently was a
>>>>>> >> > vector of length zero, or also keeps array (dim & dimnames) in
>>>>>> >> > cases these were silently dropped.
>>>>>> >>
>>>>>> >> > The following is a "heavily" commented R script showing (all ?)
>>>>>> >> > the important cases with changes :
>>>>>> >>
>>>>>> >> > ----------------------------------------------------------------------------
>>>>>> >>
>>>>>> >> > (m <- cbind(a=1[0], b=2[0]))
>>>>>> >> > Lm <- m; storage.mode(Lm) <- "logical"
>>>>>> >> > Im <- m; storage.mode(Im) <- "integer"
>>>>>> >>
>>>>>> >> > ## 1. -------------------------
>>>>>> >> > try( m & NULL ) # in R <= 3.3.x :
>>>>>> >> > ## Error in m & NULL :
>>>>>> >> > ## operations are possible only for numeric, logical or complex types
>>>>>> >> > ##
>>>>>> >> > ## gives 'Lm' in R >= 3.4.0
>>>>>> >>
>>>>>> >> > ## 2. -------------------------
>>>>>> >> > m + 2:3 ## gave numeric(0), now remains matrix identical to m
>>>>>> >> > Im + 2:3 ## gave integer(0), now remains matrix identical to Im (integer)
>>>>>> >>
>>>>>> >> > m > 1 ## gave logical(0), now remains matrix identical to Lm (logical)
>>>>>> >> > m > 0.1[0] ## ditto
>>>>>> >> > m > NULL ## ditto
>>>>>> >>
>>>>>> >> > ## 3. -------------------------
>>>>>> >> > mm <- m[,c(1:2,2:1,2)]
>>>>>> >> > try( m == mm ) ## now gives error "non-conformable arrays",
>>>>>> >> > ## but gave logical(0) in R <= 3.3.x
>>>>>> >>
>>>>>> >> > ## 4. -------------------------
>>>>>> >> > str( Im + NULL) ## gave "num", now gives "int"
>>>>>> >>
>>>>>> >> > ## 5. -------------------------
>>>>>> >> > ## special case for arithmetic w/ length-1 array
>>>>>> >> > (m1 <- matrix(1,1,1, dimnames=list("Ro","col")))
>>>>>> >> > (m2 <- matrix(1,2,1, dimnames=list(c("A","B"),"col")))
>>>>>> >>
>>>>>> >> > m1 + 1:2 # -> 2:3 but now with warning to "become ERROR"
>>>>>> >> > tools::assertError(m1 & 1:2)# ERR: dims [product 1] do not match the length of object [2]
>>>>>> >> > tools::assertError(m1 < 1:2)# ERR: (ditto)
>>>>>> >> > ##
>>>>>> >> > ## non-0-length arrays combined with {NULL or double() or ...} *fail*
>>>>>> >>
>>>>>> >> > ### Length-1 arrays: Arithmetic with |vectors| > 1 treated array as scalar
>>>>>> >> > m1 + NULL # gave numeric(0) in R <= 3.3.x --- still, *but* w/ warning to "be ERROR"
>>>>>> >> > try(m1 > NULL) # gave logical(0) in R <= 3.3.x --- an *error* now in R >= 3.4.0
>>>>>> >> > tools::assertError(m1 & NULL) # gave and gives error
>>>>>> >> > tools::assertError(m1 | double())# ditto
>>>>>> >> > ## m2 was slightly different:
>>>>>> >> > tools::assertError(m2 + NULL)
>>>>>> >> > tools::assertError(m2 & NULL)
>>>>>> >> > try(m2 == NULL) ## was logical(0) in R <= 3.3.x; now error as above!
>>>>>> >>
>>>>>> >> > ----------------------------------------------------------------------------
>>>>>> >>
>>>>>> >>
>>>>>> >> > Note that in R's own 'nls' sources, there was one case of
>>>>>> >> > situation '2)' above, i.e. a 1x1-matrix was used as a "scalar".
>>>>>> >>
>>>>>> >> > In such cases, you should explicitly coerce it to a vector,
>>>>>> >> > either ("self-explainingly") by as.vector(.), or as I did in
>>>>>> >> > the nls case by c(.) : The latter is much less
>>>>>> >> > self-explaining, but nicer to read in mathematical formulae, and
>>>>>> >> > currently also more efficient because it is a .Primitive.
>>>>>> >>
>>>>>> >> > Please use R-devel with your code, and let us know if you see
>>>>>> >> > effects that seem adverse.
>>>>>> >>
>>>>>> >> I've been slightly surprised (or even "frustrated") by the empty
>>>>>> >> reaction on our R-devel list to this post.
>>>>>> >>
>>>>>> >> I would have expected some critique, maybe even some praise,
>>>>>> >> ... in any case some sign people are "thinking along" (as we say
>>>>>> >> in German).
>>>>>> >>
>>>>>> >> In the meantime, I've actually thought along the one case which
>>>>>> >> is last above: The <op> (binary operation) between a
>>>>>> >> non-0-length array and a 0-length vector (and NULL which should
>>>>>> >> be treated like a 0-length vector):
>>>>>> >>
>>>>>> >> R <= 3.3.1 *is* quite inconsistent with these:
>>>>>> >>
>>>>>> >>
>>>>>> >> and my proposal above (implemented in R-devel, since Sep.5) would
>>>>>> >> give an error for all these, but instead, R really could be more
>>>>>> >> lenient here:
>>>>>> >> A 0-length result is ok, and it should *not* inherit the array
>>>>>> >> (dim, dimnames), since the array is not of length 0. So instead
>>>>>> >> of the above [for the very last part only!!], we would aim for
>>>>>> >> the following. These *all* give an error in current R-devel,
>>>>>> >> with the exception of 'm1 + NULL' which "only" gives a "bad
>>>>>> >> warning" :
>>>>>> >>
>>>>>> >> ------------------------
>>>>>> >>
>>>>>> >> m1 <- matrix(1,1)
>>>>>> >> m2 <- matrix(1,2)
>>>>>> >>
>>>>>> >> m1 + NULL # numeric(0) in R <= 3.3.x ---> OK ?!
>>>>>> >> m1 > NULL # logical(0) in R <= 3.3.x ---> OK ?!
>>>>>> >> try(m1 & NULL) # ERROR in R <= 3.3.x ---> change to logical(0) ?!
>>>>>> >> try(m1 | double())# ERROR in R <= 3.3.x ---> change to logical(0) ?!
>>>>>> >> ## m2 slightly different:
>>>>>> >> try(m2 + NULL) # ERROR in R <= 3.3.x ---> change to double(0) ?!
>>>>>> >> try(m2 & NULL) # ERROR in R <= 3.3.x ---> change to logical(0) ?!
>>>>>>
>>>>>> >> m2 == NULL # logical(0) in R <= 3.3.x ---> OK ?!
>>>>>> >>
>>>>>> >> ------------------------
>>>>>> >>
>>>>>> >> This would be slightly more back-compatible than the currently
>>>>>> >> implemented proposal. Everything else I said remains true, and
>>>>>> >> I'm pretty sure most changes needed in packages would remain to be
>>>>>> >> done.
>>>>>> >>
>>>>>> >> Opinions ?
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> > In some cases where R-devel now gives an error but did not
>>>>>> >> > previously, we could contemplate giving another "warning
>>>>>> >> > .... 'to become ERROR'" if there was too much breakage, though
>>>>>> >> > I don't expect that.
>>>>>> >>
>>>>>> >>
>>>>>> >> > For the R Core Team,
>>>>>> >>
>>>>>> >> > Martin Maechler,
>>>>>> >> > ETH Zurich
>>>>>> >>
>>>>>> >> ______________________________________________
>>>>>> >> R-devel at r-project.org mailing list
>>>>>> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>> >>
>>>>>>
>>>>>>
>>>>>>
>>>>>> > --
>>>>>> > Robin Hankin
>>>>>> > Neutral theorist
>>>>>> > hankin.robin at gmail.com
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Gabriel Becker, PhD
>>>>> Associate Scientist (Bioinformatics)
>>>>> Genentech Research
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>
>
>