[R] Undesired result
Val
v@|kremk @end|ng |rom gm@||@com
Wed Feb 17 19:45:54 CET 2021
Very helpful and thank you so much!
On Wed, Feb 17, 2021 at 12:50 PM Duncan Murdoch
<murdoch.duncan using gmail.com> wrote:
>
> On 17/02/2021 9:50 a.m., Val wrote:
> > Hi all,
> >
> > I am reading a data file that contains dates in several formats. I wanted to
> > standardize them to one format and used the anytime package, but got
> > undesired results, as shown below. It gave me the year 2093 instead of 1993.
> >
> >
> > library(anytime)
> > DFX<-read.table(text="name ddate
> > A 19-10-02
> > D 11/19/2006
> > F 9/9/2011
> > G1 12/29/2010
> > AA 10/18/93 ",header=TRUE)
> > getFormats()
> > addFormats(c("%d-%m-%y"))
> > addFormats(c("%m-%d-%y"))
> > addFormats(c("%Y/%d/%m"))
> > addFormats(c("%m/%d/%y"))
> >
> > DFX$anew=anydate(DFX$ddate)
> >
> > Output
> > name ddate anew
> > 1 A 19-10-02 2002-10-19
> > 2 D 11/19/2006 2020-11-19
> > 3 F 9/9/2011 2011-09-09
> > 4 G1 12/29/2010 2020-12-29
> > 5 AA 10/18/93 2093-10-18
> >
> > The problem is in the last row. It should be 1993-10-18 instead of 2093-10-18.
> >
> > How do I correct this?
>
> This looks a little tricky. The basic idea is that the %y format has to
> guess at the century, but the guess depends on things specific to your
> system. So what would be nice is to say "two-digit years should be
> assumed to fall between 1922 and 2021", but there's no way to do that
> directly.
>
> What you could do is recognize when you have a two-digit year, and then
> force the result into the range you want. Here's a function that does
> that, but it's not really tested much at all, so be careful if you use
> it. (One thing: I recommend the 'useR = TRUE' option to anydate(); it
> worked better in my tests than the default.)
>
> adjustCentury <- function(inputString,
>                           outputDate = anydate(inputString, useR = TRUE),
>                           start = "1922-01-01") {
>
>   start <- as.Date(start)
>
>   # Inputs without four consecutive digits are treated as two-digit years
>   twodigityear <- !grepl("[[:digit:]]{4}", inputString)
>
>   # Move two-digit-year dates that fall before the window forward a century
>   while (length(bad <- which(twodigityear & outputDate < start))) {
>     for (i in bad) {
>       longdate <- as.POSIXlt(outputDate[i])
>       longdate$year <- longdate$year + 100
>       outputDate[i] <- as.Date(longdate)
>     }
>   }
>   # The window ends 100 years after start
>   longdate <- as.POSIXlt(start)
>   longdate$year <- longdate$year + 100
>   finish <- as.Date(longdate)
>
>   # Move two-digit-year dates that fall after the window back a century
>   while (length(bad <- which(twodigityear & outputDate >= finish))) {
>     for (i in bad) {
>       longdate <- as.POSIXlt(outputDate[i])
>       longdate$year <- longdate$year - 100
>       outputDate[i] <- as.Date(longdate)
>     }
>   }
>   outputDate
> }
>
> library(anytime)
> DFX<-read.table(text="name ddate
> A 19-10-02
> D 11/19/2006
> F 9/9/2011
> G1 12/29/2010
> AA 10/18/93
> BB 10/18/1893
> CC 10/18/2093",header=TRUE)
>
> addFormats(c("%d-%m-%y"))
> addFormats(c("%m-%d-%y"))
> addFormats(c("%Y/%d/%m"))
> addFormats(c("%m/%d/%y"))
>
> DFX$anew=adjustCentury(DFX$ddate, start = "1921-01-01")
> DFX
> #> name ddate anew
> #> 1 A 19-10-02 2019-10-02
> #> 2 D 11/19/2006 2006-11-19
> #> 3 F 9/9/2011 2011-09-09
> #> 4 G1 12/29/2010 2010-12-29
> #> 5 AA 10/18/93 1993-10-18
> #> 6 BB 10/18/1893 1893-10-18
> #> 7 CC 10/18/2093 2093-10-18
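
The century window described in the quoted reply (two-digit years assumed
to fall between 1922 and 2021) can also be sketched with modular arithmetic
on the year alone. This is a minimal base R sketch, assuming every input is
already known to be in m/d/yy form; expandYear and its pivot argument are
hypothetical names introduced here for illustration and are not part of
anytime.

```r
# Hypothetical helper (not part of anytime): map a two-digit year to the
# unique full year inside the 100-year window [pivot, pivot + 99].
expandYear <- function(yy, pivot = 1922L) {
  yy <- as.integer(yy)
  # %% always returns a value in 0..99 here, so the result stays in the window
  pivot + (yy - pivot) %% 100L
}

expandYear(c(93, 2, 21, 22))
#> [1] 1993 2002 2021 1922

# Assumption: all strings are month/day/two-digit-year separated by "/"
x <- c("10/18/93", "9/9/11")
parts <- do.call(rbind, strsplit(x, "/", fixed = TRUE))
parsed <- as.Date(sprintf("%04d-%02d-%02d",
                          expandYear(parts[, 3]),
                          as.integer(parts[, 1]),
                          as.integer(parts[, 2])))
parsed
#> [1] "1993-10-18" "2011-09-09"
```

Unlike adjustCentury(), this does not detect the format or the presence of
a four-digit year, so it only fits columns whose layout is already known.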