[Rd] A potential POSIXlt->Date bug introduced in r-devel
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Fri Oct 7 14:52:01 CEST 2022
>>>>> Martin Maechler
>>>>> on Thu, 6 Oct 2022 10:15:29 +0200 writes:
>>>>> Davis Vaughan
>>>>> on Wed, 5 Oct 2022 17:04:11 -0400 writes:
>> Hi all,
>> I think I have discovered a bug in the conversion from POSIXlt to Date that
>> has been introduced in r-devel.
>> It affects lubridate, but surprisingly didn't cause test failures there.
>> Instead it caused test failures in users of lubridate, like slider, arrow,
>> and admiral (see https://github.com/tidyverse/lubridate/issues/1069), and
>> at least in slider I have been asked by CRAN to correct this issue before
>> 2022-10-16.
>> In r-devel we get the following:
>> ```
>> data <- list(
>> sec = 0,
>> min = 0L,
>> hour = 0L,
>> mday = 31L,
>> mon = c(0L, NA, 2L),
>> year = 113L,
>> wday = 4L,
>> yday = 30L,
>> isdst = 0L
>> )
>> x <- .POSIXlt(xx = data, tz = "UTC")
>> x
>> #> [1] "2013-01-31 UTC" NA "2013-03-31 UTC"
>> # Looks right
>> as.POSIXct(x)
>> #> [1] "2013-01-31 UTC" NA "2013-03-31 UTC"
>> # Weird, where is the `NA`?
>> as.Date(x)
>> #> [1] "2013-01-31" "1970-01-01" "2013-03-31"
>> ```
> I agree that the above is wrong, i.e., a bug in current R-devel.
>> The POSIXlt object is length 3, but is only partially filled out.
>> The other elements are all recycled to length 3 upon
>> conversion to POSIXct or Date.
>> But when converting to Date, we lose the `NA` value. I think the
>> `as.Date()` conversion seems inconsistent with the `as.POSIXct()`
>> conversion.
> Yes. There was another very much related conversation here on R-devel,
> initiated by Suharto Anggono just a few days ago.
> This subject, i.e., "partially filled out" POSIXlt objects, was
> one of the topics, too.
> See my reply there, notably at the end:
> https://stat.ethz.ch/pipermail/r-devel/2022-October/082072.html
> I do mention that "recycling" of partially filled POSIXlt
> objects has only partially been implemented in R more generally
> and was actually asking for comments and further discussion.
>> It looks like this comes up because the conversion to Date now defaults to
>> using `sec` if any of the date-like fields are `NA_INTEGER`,
> yes, because only that allows us to also deal with +/- Inf etc.,
> as was recently added as new feature, see the NEWS of R 4.2.0
> • Not strictly fixing a bug, format()ing and print()ing of
> non-finite Date and POSIXt values NaN and +/-Inf no longer show
> as NA but the respective string, e.g., Inf, for consistency with
> numeric vector's behaviour, fulfilling the wish of PR#18308.
> i.e., see also R's bugzilla
> https://bugs.r-project.org/show_bug.cgi?id=18308
> which actually *also* mentioned an NA problem in Date/Time objects.
>> but this means the `NA` in the `mon` field is ignored.
> which I agree is bogus and we'll fix.
> Still, I did not get any feedback on asking about documentation
> etc on POSIXlt objects ... and I *had* mentioned I agreed that
> the current partial implementation of "partially filled" i.e. recycling of
> POSIXlt components should probably be made part of the
> "definition" of POSIXlt.
> Have I overlooked an existing definition / contract about these?
> Martin
I'm still waiting for comments.
Note that "partially filled" POSIXlt objects do not work correctly in
any version of R. I mentioned that even length(.) may easily
fail; but there is much more.
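
To make "partially filled" concrete, here is a minimal sketch that reuses Davis' object from above and only inspects the lengths of its components. The variable name `fieldLengths` is just for illustration; nothing here goes beyond base R.

```r
# Davis' example: every field has length 1 except 'mon', which has length 3
x <- .POSIXlt(list(sec = 0, min = 0L, hour = 0L, mday = 31L,
                   mon = c(0L, NA, 2L), year = 113L,
                   wday = 4L, yday = 30L, isdst = 0L), tz = "UTC")

# Length of each component of the underlying list
fieldLengths <- lengths(unclass(x))
print(fieldLengths)
```

Any method that looks at a single component to decide how long the object is can therefore get a different answer depending on which component it picks.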
While I can relatively easily fix Davis' case above,
the following example behaves wrongly in current and previous
released versions of R and in R-devel:
  dlt <- .POSIXlt(list(sec = c(-999, 10000 + c(1:10,-Inf, NA)) + pi,
                       # "out of range", non-finite, fractions
                       min = 45L, hour = c(21L, 3L, NA, 4L),
                       mday = 6L, mon = c(11L, NA, 3L),
                       year = 116L, wday = 2L, yday = 340L, isdst = 1L))
Of course that's constructed to be particularly unpleasant.
You can try some of the following "checks" in your version(s) of
R to see that some of the things are misbehaving with it in all (*)
versions of R.
--
*) so I claim boldly
  dct <- as.POSIXct(dlt)
  (n <- length(dct))
  dD  <- as.Date(dlt)
  dDc <- as.Date(dct)
  dltN <- as.POSIXlt(dct) # "normalized POSIXlt" (with *lost* accuracy):
  data.frame(unclass(dltN))

  .POSIXltNormalize <- function(x) {
      stopifnot(is.numeric(s <- x$sec))
      x <- as.POSIXlt(as.POSIXct(x)) # and restore the precise seconds :
      ifin <- is.finite(s) & is.finite(x$sec) # (maybe recycling already)
      x$sec[ifin] <- s[ifin] %% 60
      x
  }
  dlt2 <- .POSIXltNormalize(dlt) # normalized POSIXlt - with accuracy kept
  all.equal(dlt2$sec, dltN$sec, tolerance = 0) # .. small (2e-9) difference
  stopifnot(all.equal(dlt2, dltN),
            identical(as.POSIXct(dlt2), as.POSIXct(dltN)))

  ## First show (in a way it also works for older R), then check :
  oldR <- getRversion() < "4.2.2"
  print(width = 101,
        data.frame(dlt, dltN, asCT = dct, asDateCT = dDc,
                   asDate = if(oldR) rep_len(dD, n) else dD ,
                   na  = is.na(dlt),
                   fin = if(oldR) rep_len(NA, n) else is.finite(dlt))
        )
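
For comparison, the recycling that a "definition" of partially filled POSIXlt objects would imply can be sketched as below. This is only an illustration under assumptions: `recycleLt` is a hypothetical helper name, not an existing base R function, and it only equalizes the component lengths; it does not normalize out-of-range values the way `.POSIXltNormalize()` above does.

```r
# Hypothetical helper (not part of base R): recycle every component of a
# POSIXlt object to the length of its longest component.
recycleLt <- function(x) {
    stopifnot(inherits(x, "POSIXlt"))
    att <- attributes(x)                  # names, class, tzone
    fields <- unclass(x)
    n <- max(lengths(fields))             # length of the longest component
    fields <- lapply(fields, rep_len, length.out = n)
    attributes(fields) <- att             # restore names, class and tzone
    fields
}

# A small partially filled object: only 'mon' has length 3
x <- .POSIXlt(list(sec = 0, min = 0L, hour = 0L, mday = 31L,
                   mon = c(0L, NA, 2L), year = 113L,
                   wday = 4L, yday = 30L, isdst = 0L), tz = "UTC")
x2 <- recycleLt(x)
lengths(unclass(x2))   # all components now have the same length
```

Whether such recycling should happen at construction time or inside each method is exactly the open question about the POSIXlt "contract".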
--
Martin Mächler
ETH Zurich and R Core team