[R] Problem with diff(strptime(...

Jim Lemon jim at bitwrit.com.au
Thu Mar 20 11:59:06 CET 2008


Hi all,

I have been chipping away at a problem I encountered in calculating 
rates per year from a moderately large data file (46412 rows). When I 
ran the following command, I got obviously wrong output:

interval<-
  c(NA,as.numeric(diff(
  strptime(mkdf$MEAS_DATE,"%d/%m/%Y")))/365.25)

The values in MEAS_DATE looked like this:

mkdf$MEAS_DATE[1:10]
  [1] 1/5/1962  1/5/1963  1/5/1964  1/3/1965  1/4/1966  1/4/1967  1/6/1968
  [8] 25/3/1969 1/4/1971  1/2/1974
146 Levels: 10/10/1967 1/10/1947 1/10/1965 1/10/1967 1/10/1983 ... 9/1/1992

To abbreviate three evenings of work, I finally found that values 17170 
and 17171 were identical. If I ran the entire set, or any subset that 
extended past row 17170, I would get output like this:

interval[1:10]
  [1]        NA  86340.86  86577.41  71911.29  93673.92  86340.86 101006.98
  [8]  70255.44 174337.58 245292.81

If I ran any set of values up to 17170, I would get the correct output:

interval[1:10]
  [1]        NA 0.9993155 1.0020534 0.8323066 1.0841889 0.9993155 1.1690623
  [8] 0.8131417 2.0177960 2.8390372

If I changed value 17171 by one day (and added that level), the command 
worked correctly:

interval[1:10]
  [1]        NA 0.9993155 1.0020534 0.8323066 1.0841889 0.9993155 1.1690623
  [8] 0.8131417 2.0177960 2.8390372
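
One rough check on the numbers printed above (an observation, not a 
confirmed diagnosis): dividing each wrong value by the matching correct 
value seems to give the same factor every time. The vector names wrong, 
right and ratio are only for illustration; the figures are copied from 
the two outputs.

```r
# Values copied from the wrong and the correct interval[2:10] outputs
wrong <- c(86340.86, 86577.41, 71911.29, 93673.92, 86340.86,
           101006.98, 70255.44, 174337.58, 245292.81)
right <- c(0.9993155, 1.0020534, 0.8323066, 1.0841889, 0.9993155,
           1.1690623, 0.8131417, 2.0177960, 2.8390372)

# Element-wise ratio, rounded to the nearest whole number
ratio <- round(wrong / right)
print(ratio)
```

If the ratio is 86400 throughout, that is the number of seconds in a 
day, which would suggest the wrong intervals are seconds/365.25 rather 
than days/365.25.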

There have been a few messages about this problem, but apparently no 
solution. The problem can be seen with these examples (I haven't 
included the real data as it is not mine):

foodate<-c("1/7/1991","1/8/1991","1/8/1991","3/8/1991")
as.numeric(diff(strptime(foodate,"%d/%m/%Y"))/365.25)
[1] 7333.0595    0.0000  473.1006

foodate<-factor(c("1/7/1991","1/8/1991","1/8/1991","3/8/1991"))
as.numeric(diff(strptime(foodate,"%d/%m/%Y"))/365.25)
[1] 7333.0595    0.0000  473.1006

foodate<-factor(c("1/7/1991","1/8/1991","2/8/1991","3/8/1991"))
as.numeric(diff(strptime(foodate,"%d/%m/%Y"))/365.25)
[1] 0.084873374 0.002737851 0.002737851

Beats me.
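
One thing that may be worth checking is the units of the object that 
diff() returns. The sketch below assumes that diff() on date-times gives 
a difftime whose units are chosen automatically from the smallest 
difference, so a zero difference could switch the whole vector to 
seconds. The names d, years_a and years_b are made up for illustration, 
and tz = "UTC" is my own addition to keep daylight saving out of the 
picture.

```r
foodate <- factor(c("1/7/1991", "1/8/1991", "1/8/1991", "3/8/1991"))

# Same calculation as above, but keep the difftime so its units can be inspected
d <- diff(strptime(as.character(foodate), "%d/%m/%Y", tz = "UTC"))
print(units(d))

# Option A: ask for days explicitly when converting to numeric
years_a <- as.numeric(d, units = "days") / 365.25

# Option B: work with Date objects, whose differences are in days
years_b <- as.numeric(diff(as.Date(as.character(foodate),
                                   format = "%d/%m/%Y"))) / 365.25

print(years_a)
print(years_b)
```

Either option should give intervals in years whether or not two 
consecutive dates are equal; difftime() with units = "days" would be 
another way to pin the units.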

Jim