[R] Generation of missing values in a time serie...
Gabor Grothendieck
ggrothendieck at gmail.com
Tue Dec 13 19:11:28 CET 2005
Yes, this is the definition of a time series and therefore of a zoo object.
A time series is a mathematical function, i.e. it assigns a single element
of the range to each element of the domain. Because two observations share
the same time, this data does not describe a time series.
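A quick way to check for this, sketched in base R with the times from the
data quoted below (the variable names `tt` and `dup_times` are only
illustrative):

```r
# Times copied from the poster's data; the second and third coincide.
tt <- c(123.125683, 123.241137, 123.241137,
        123.472631, 123.588613, 123.704594)

# Position of the first repeated time, or 0 if all times are unique.
anyDuplicated(tt)

# The time values that occur more than once.
dup_times <- unique(tt[duplicated(tt)])
dup_times
```

If anyDuplicated returns 0, the times define a proper function of time and
can be used as a zoo index directly.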
Also, it has no underlying regularity, as the warning message states.
To use as.ts one wants a series with an underlying regularity that has
gaps and then as.ts will fill in the gaps with NAs.
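As a minimal sketch of that case, assuming the zoo package is installed and
using made-up values (the series `x` and its times are illustrative, not
taken from the poster's data):

```r
library(zoo)

# Illustrative series: times 2 and 5 are missing from a unit-spaced grid.
x <- zoo(c(10, 20, 30, 40), c(1, 3, 4, 6))

# TRUE: every time difference is a multiple of the smallest one.
is.regular(x)

# The result covers times 1 through 6, with NA at times 2 and 5.
x.ts <- as.ts(x)
x.ts
```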
If we don't have an underlying regularity the question is not well posed
but it's likely we want to discretize time. The zoo command itself is
somewhat forgiving, at least in this case, i.e. it allows one to specify
this illegal zoo object with non-unique times for purposes of discretization;
however, such a zoo object should not be used other than to get a legal
zoo object out.
For example, in the following we round the times to one decimal place
and then within each set of values at the same discretized time take the
last one. (Alternatively, specify mean instead of tail, 1 if the average
is preferred.) Then we convert that to a ts object:
> as.ts(aggregate(z, round(time(z), 1), tail, 1))
Time Series:
Start = c(123, 2)
End = c(123, 8)
Frequency = 10
          time flow seq       ts     x      rtt size
123.1 123.1257    0 967 123.1257 13394 0.798205 1472
123.2 123.2411    0 969 123.2411 12680 0.796258 1472
123.3       NA   NA  NA       NA    NA       NA   NA
123.4       NA   NA  NA       NA    NA       NA   NA
123.5 123.4726    0 970 123.4726 12680 0.796258 1472
123.6 123.5886    0 971 123.5886 12680 0.796258 1472
123.7 123.7046    0 972 123.7046 12680 0.796258 1472
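For anyone who wants to try this without the original file, here is a
self-contained sketch, assuming the zoo package is installed. It rebuilds
only three of the columns (seq, x, rtt) from the dput output quoted below;
the names `tt`, `mat`, `za` and `z.ts` are just for illustration:

```r
library(zoo)

# Times from the quoted data; entries 2 and 3 are identical.
tt  <- c(123.125683, 123.241137, 123.241137,
         123.472631, 123.588613, 123.704594)

# A subset of the poster's columns, to keep the example short.
mat <- cbind(seq = 967:972,
             x   = c(13394, 12680, 12680, 12680, 12680, 12680),
             rtt = c(0.798205, rep(0.796258, 5)))

# zoo may warn about the non-unique index; this object is only an
# intermediate step toward a legal one.
z <- suppressWarnings(zoo(mat, tt))

# Discretize time to one decimal place and keep the last row per bucket.
za <- aggregate(z, round(time(z), 1), tail, 1)

# 0: the index of the aggregated object is now unique.
anyDuplicated(time(za))

# Convert the legal zoo object to a ts object.
z.ts <- as.ts(za)
```

Replacing tail, 1 with mean in the aggregate call averages the rows that
fall into the same bucket instead of keeping the last one.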
On 12/13/05, Alvaro Saurin <saurin at dcs.gla.ac.uk> wrote:
>
> I think I have found the error. It appears when there are two entries
> with the same time. Using as input file:
>
> --------- CUT --------
> # Output format for PCKs:
> # TIME FLOW P [+-] SEQ TS X RTT SIZE
> #
> 123.125683 0 P + 967 123.125683 13394 0.798205 1472
> 123.241137 0 P + 968 123.241137 12680 0.796258 1472
> 123.241137 0 P + 969 123.241137 12680 0.796258 1472
> 123.472631 0 P + 970 123.472631 12680 0.796258 1472
> 123.588613 0 P + 971 123.588613 12680 0.796258 1472
> 123.704594 0 P + 972 123.704594 12680 0.796258 1472
> --------- CUT --------
>
> I run the following code:
>
> --------- CUT --------
> h_types <- list (0, 0, NULL, NULL, 0, 0, 0, 0, 0)
> h_names <- list ("time", "flow", "seq", "ts", "x", "rtt", "size")
>
> pcks_file <- pipe ("grep ' P ' data", "r")
> pcks <- scan (pcks_file, what = h_types, comment.char = '#',
> fill = TRUE)
> mat_df <- data.frame (pcks[1:2], pcks[5:9])
> mat <- as.matrix (mat_df)
> colnames (mat) <- h_names
> z <- zoo (mat, mat [,"time"])
> --------- CUT --------
>
> The dput of 'z' shows:
>
> --------- CUT --------
> structure(c(123.125683, 123.241137, 123.241137, 123.472631, 123.588613,
> 123.704594, 0, 0, 0, 0, 0, 0, 967, 968, 969, 970, 971, 972, 123.125683,
> 123.241137, 123.241137, 123.472631, 123.588613, 123.704594, 13394,
> 12680, 12680, 12680, 12680, 12680, 0.798205, 0.796258, 0.796258,
> 0.796258, 0.796258, 0.796258, 1472, 1472, 1472, 1472, 1472, 1472
> ), .Dim = c(6, 7), .Dimnames = list(c("1", "2", "3", "4", "5",
> "6"), c("time", "flow", "seq", "ts", "x", "rtt", "size")), index =
> structure(c(123.125683,
> 123.241137, 123.241137, 123.472631, 123.588613, 123.704594), .Names =
> c("1",
> "2", "3", "4", "5", "6")), class = "zoo")
> --------- CUT --------
>
> If I try a 'as.ts(z)', it fails. If I remove the duplicate entry, I
> can convert it to a TS with no problem. Is this intentional?
> Because then I have to filter the input matrix... But, anyway, the
> output matrix, after filtering, doesn't seem regular:
>
> --------- CUT --------
> > as.ts (z)
> Time Series:
> Start = 1
> End = 5
> Frequency = 1
>       time flow seq       ts     x      rtt size
> 1 123.1257    0 967 123.1257 13394 0.798205 1472
> 2 123.2411    0 969 123.2411 12680 0.796258 1472
> 3 123.4726    0 970 123.4726 12680 0.796258 1472
> 4 123.5886    0 971 123.5886 12680 0.796258 1472
> 5 123.7046    0 972 123.7046 12680 0.796258 1472
> Warning message:
> 'x' does not have an underlying regularity in: as.ts.zoo(z)
> --------- CUT --------
>
> Weird...
>
>
> On 13 Dec 2005, at 16:33, Gabor Grothendieck wrote:
>
> > Please provide a reproducible example. Note that dput(x) will output
> > an R object in a way that can be copied and pasted into another
> > session.
> >
> > On 12/13/05, Alvaro Saurin <saurin at dcs.gla.ac.uk> wrote:
> >>
> >> On 13 Dec 2005, at 13:08, Gabor Grothendieck wrote:
> >>
> >>> Your variable mat is not a matrix; it's a data frame. Check it with:
> >>>
> >>> class(mat)
> >>>
> >>> Here is an example:
> >>>
> >>> x <- cbind(A = 1:4, B = 5:8)
> >>> tt <- c(1, 3:4, 6)
> >>>
> >>> library(zoo)
> >>> x.zoo <- zoo(x, tt)
> >>> x.ts <- as.ts(x.zoo)
> >>
> >> Fixed, but anyway it fails:
> >>
> >>> h_types <- list (0, 0, NULL, NULL, 0, 0, 0, 0, 0)
> >>> h_names <- list ("time", "flow", "seq", "ts", "x", "rtt",
> >>> "size")
> >>
> >>> pcks_file <- pipe ("grep ' P ' server.dat", "r")
> >>> pcks <- scan (pcks_file, what = h_types, comment.char = '#',
> >>>               fill = TRUE)
> >>
> >>> mat_df <- data.frame (pcks[1:2], pcks[5:9])
> >>> mat <- as.matrix (mat_df)
> >>> colnames (mat) <- h_names
> >>
> >>> class (mat)
> >> [1] "matrix"
> >>
> >>> z <- zoo (mat, mat [,"time"])
> >>
> >>> z
> >>            time     flow      seq       ts           x      rtt        size
> >> 1.0009 1.000893 0.000000 0.000000 1.000893 1472.000000 0.000000 1472.000000
> >> 1.5145 1.514454 0.000000 1.000000 1.514454 2944.000000 0.513142 1472.000000
> >> 2.0151 2.015093 0.000000 2.000000 2.015093 2944.000000 0.513142 1472.000000
> >> 2.515  2.515025 0.000000 3.000000 2.515025 4806.000000 0.504488 1472.000000
> >> 2.822  2.821976 0.000000 4.000000 2.821976 5730.000000 0.496728 1472.000000
> >> [...]
> >>
> >>> as.ts (z)
> >> Error in if (del == 0 && to == 0) return(to) :
> >> missing value where TRUE/FALSE needed
> >>
> >> Any idea? Thanks for your help.
> >>
> >> Alvaro
> >>
> >>
> >> --
> >> Alvaro Saurin <alvaro.saurin at gmail.com> <saurin at dcs.gla.ac.uk>
> >>
> >>
> >>
> >>
>
> --
> Alvaro Saurin <alvaro.saurin at gmail.com> <saurin at dcs.gla.ac.uk>
>
>
>
>