[Rd] Bug with `[<-.POSIXlt` on specific OSes
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Sat Oct 22 14:12:47 CEST 2022
>>>>> Martin Maechler
>>>>> on Tue, 18 Oct 2022 10:56:25 +0200 writes:
>>>>> Suharto Anggono Suharto Anggono via R-devel
>>>>> on Fri, 14 Oct 2022 16:21:14 +0000 (UTC) writes:
>> I think '[.POSIXlt' and '[<-.POSIXlt' don't need to
>> normalize out-of-range values. I think they just need to
>> make all components the same length, to ensure correct
>> extraction or replacement for an arbitrary index.
> Yes, you are right; this is definitely correct... and
> would be more efficient.
> So far, we have mostly focused on *correct*
> behaviour in the case of "ragged" and/or out-of-range
> POSIXlt objects.
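
To make "same length for all components" concrete, here is a minimal
sketch of filling by recycling only. The helper fill_lt() is
hypothetical (it is not a base R function and not what the methods
call internally); it only recycles and deliberately does no
"in-ranging".

```r
# A "ragged" POSIXlt: 'sec' has length 2, every other component length 1.
# (2022-01-01 was a Saturday, hence wday = 6 and yday = 0.)
ragged <- .POSIXlt(list(sec = c(0, 30), min = 15L, hour = 12L,
                        mday = 1L, mon = 0L, year = 122L,
                        wday = 6L, yday = 0L, isdst = 0L),
                   tz = "UTC")

# fill_lt() is a hypothetical helper, not part of base R: it recycles
# each component to the longest length and changes no values.
fill_lt <- function(x) {
  stopifnot(inherits(x, "POSIXlt"))
  at <- attributes(x)                  # names, class, tzone, ...
  comps <- unclass(x)
  n <- max(lengths(comps))
  comps <- lapply(comps, rep_len, length.out = n)
  attributes(comps) <- at              # restore what lapply() dropped
  comps
}

filled <- fill_lt(ragged)
lengths(unclass(ragged))   # component lengths before filling
lengths(unclass(filled))   # component lengths after filling
```

Once all components have the same length, indexing every component
with the same 'i' is safe, which is all that extraction and
replacement strictly need.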
>> I have thought of adding an optional argument to
>> 'as.POSIXlt' when applied to a "POSIXlt" object.
>> Possible names:
>>   normalize
>>   adjust
>>   fixup
>> To allow recycling only, without changing content, instead
>> of TRUE or FALSE it could be a choice, like
>>   fixup = c("none", "balance", "normalize"),
>> where "normalize" implies "balance", or
>>   adjust = c("none", "length", "content", "value"),
>> where "content" and "value" are synonymous.
> Such an optional argument for as.POSIXlt() would be a
> possibility and could replace the new and for now still
> somewhat experimental balancePOSIXlt().
> +: One advantage of (one of the above proposals) would
> be that it does not take up a new function name.
> -: OTOH, it may be overdoing the semantics
> as.POSIXlt(<POSIXlt>, <some> = <other>)
> and it may be harder to understand by
> non-sophisticated R users, because as.POSIXlt() is a
> generic with several methods, and these extra arguments
> would probably only apply to the as.POSIXlt.default()
> method and there *only* for the case where the argument
> inherits from "POSIXlt" .. and all that being somewhat
> subtle to see for Joe Average UseR
> I agree that it will make sense to get an R-level
> version, either using new arguments in as.POSIXlt() or
> (still my preference) in balancePOSIXlt(), which allows one
> to "only fill all components".
> HOWEVER, note that "filling" (by recycling) with no
> extra checking will often lead to internally
> inconsistent lt objects. E.g., daylight saving time
> (isdst = 1 or not) can only be known when the day (and
> hour) is known, and that can be shifted by out-of-range
> sec/min/hour. (And of course, for 1 hour per year, a
> time with hour = 2 will *need* specification of isdst in
> order to know which of the two 2:<min>:<sec> is meant.)
> Also, $wday and $yday (which are described as read-only)
> can only be checked after validation or "in-ranging" of
> the sec/min/hour/mday/mon components, so their simple
> recycling will typically be incorrect.
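
A small sketch of that inconsistency, using UTC so that DST plays no
role. The object and its values are made up for illustration; the
round trip through as.POSIXct() is used here only as a convenient way
to bring the fields into range, not as a claim about how balancing is
implemented.

```r
# 'sec' = 86400 is one full day out of range; 'wday' and 'yday' still
# describe 2022-01-01 (a Saturday, day 0 of the year).
x <- .POSIXlt(list(sec = 86400, min = 0L, hour = 0L,
                   mday = 1L, mon = 0L, year = 122L,
                   wday = 6L, yday = 0L, isdst = 0L),
              tz = "UTC")

# Converting to POSIXct brings the fields into range; converting back
# recomputes the derived components from the resulting time.
ct  <- as.POSIXct(x)
fix <- as.POSIXlt(ct, tz = "UTC")

format(ct, "%Y-%m-%d", tz = "UTC")
c(stored = unclass(x)$wday, recomputed = fix$wday)
c(stored = unclass(x)$yday, recomputed = fix$yday)
```

The stored 'wday' and 'yday' belong to the day before the one the
object actually represents, so recycling them unchanged carries the
inconsistency along.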
> That's why I had opted to *mainly* do full "balancing"
> (in my sense), i.e., simultaneously both filling and
> "in-ranging".
A few hours ago [R-devel svn rev 83156; 2022-10-22 10:18:38 +0200]
I have committed an enhanced version of balancePOSIXlt() which
now has an optional 'fill.only = F/T' argument.
When TRUE (not by default), it will only do the "filling", i.e.,
recycling of less-than-full-length components, without any
"in-ranging" or much further validity checking.
Currently, almost all POSIXlt methods using balancePOSIXlt(),
notably
[.POSIXlt and [<-.POSIXlt
use balancePOSIXlt(x, fill.only=TRUE ..)
and hence are almost as fast as previously (when they did no
balancing and sometimes gave wrong results or errored in the case
of partially filled POSIXlt).
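
A minimal sketch of the two modes side by side, assuming an R version
that already provides balancePOSIXlt() with the 'fill.only' argument
(R-devel as of this message). The example object is made up, and the
calls are guarded so the snippet does nothing on older versions.

```r
# Ragged and out-of-range input: 'sec' has two values, one of them > 59.
x <- .POSIXlt(list(sec = c(10, 3700), min = 0L, hour = 12L,
                   mday = 1L, mon = 0L, year = 122L,
                   wday = 6L, yday = 0L, isdst = 0L),
              tz = "UTC")

# balancePOSIXlt() only exists in sufficiently recent R, so guard the calls.
if (exists("balancePOSIXlt")) {
  filled   <- balancePOSIXlt(x, fill.only = TRUE)  # recycle only
  balanced <- balancePOSIXlt(x)                    # fill and "in-range"

  print(lengths(unclass(filled)))  # component lengths after filling
  print(unclass(filled)$sec)       # 'sec' after filling only
  print(unclass(balanced)$sec)     # 'sec' after full balancing
}
```

Comparing the two 'sec' vectors shows the trade-off: fill.only = TRUE
is cheap and sufficient for indexing, while the default also repairs
the content.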
>> By the way, Inf in 'sec' component is out-of-range!
> Yes, the non-finite "values" {+/-Inf, NaN, NA} are all
> "special", and we had decided to allow them for
> compatibility with classes "Date" and "POSIXct".
> BTW, a few days ago, I updated the
> help("DateTimeClasses") page in R-devel to document a
> bit more, notably that "ragged" and out-of-range POSIXlt
> may exist... see (the always +- current R-devel Help
> pages at)
> https://stat.ethz.ch/R-manual/R-devel/library/base/html/DateTimeClasses.html
>> For 'gmtoff', NA or 0 should be put for unknown. A known
>> 'gmtoff' may be positive, negative, or zero. The
>> documentation says ‘gmtoff’ (Optional.) The offset in
>> seconds from GMT: positive values are East of the
>> meridian. Usually ‘NA’ if unknown, but ‘0’ could mean
>> unknown.
>> dlt <- .POSIXlt(list(sec = c(-999, 10000 + c(1:10,-Inf, NA)) + pi,
>>                      # "out of range", non-finite, fractions
>>                      min = 45L, hour = c(21L, 3L, NA, 4L),
>>                      mday = 6L, mon = c(11L, NA, 3L),
>>                      year = 116L, wday = 2L, yday = 340L,
>>                      isdst = 1L))
>> as.POSIXct(dlt)[1] is NA on Linux with a time zone without
>> DST, for example after Sys.setenv(TZ = "EST").
> Hmm... I needed time to look at the above. Indeed, one
> gets NA (and has in previous versions of R) in such a
> case.
> After applying balancePOSIXlt(), one no longer gets NA.
> Are you proposing that we should do that (or possibly
> simple recycling) in as.POSIXct.POSIXlt() ?
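
For anyone who wants to try this, a sketch of the reproduction,
assuming a platform where the "EST" time zone is available. The result
is platform dependent (NA was reported on Linux), so no expected
output is shown. The balancePOSIXlt() call is guarded because the
function only exists in recent R.

```r
# The ragged, out-of-range object from above, with isdst = 1.
dlt <- .POSIXlt(list(sec = c(-999, 10000 + c(1:10, -Inf, NA)) + pi,
                     min = 45L, hour = c(21L, 3L, NA, 4L),
                     mday = 6L, mon = c(11L, NA, 3L),
                     year = 116L, wday = 2L, yday = 340L,
                     isdst = 1L))

# Switch to a time zone without DST, remembering the previous setting.
old_tz <- Sys.getenv("TZ", unset = NA)
Sys.setenv(TZ = "EST")

res <- as.POSIXct(dlt)[1]
print(res)                                   # platform dependent
if (exists("balancePOSIXlt"))
  print(as.POSIXct(balancePOSIXlt(dlt))[1])  # after balancing first

# Restore the previous TZ setting.
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```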
I am still waiting for comments (also from others) and for
answers to this question.
Martin