[R] complicated time series filtering issue
Kimmo Elo
k|mmo@e|o @end|ng |rom utu@||
Tue Apr 5 15:06:11 CEST 2022
Hi!
Here is an alternative solution using DATENUMBER:
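i <- 2
while (i <= nrow(bc.df)) {   # '<=' so that the last row is tested as well
  gap <- bc.df$DATENUMBER[i] - bc.df$DATENUMBER[i-1]
  if (gap <= 10) {
    # not more than 10 days after the previous kept row: drop it, re-test index i
    bc.df <- bc.df[-i, ]
  } else {
    bc.df$interval[i] <- gap   # interval to the previous kept row
    i <- i + 1
  }
}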
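The loop above modifies bc.df in place and handles one individual. Since the original question mentions several individuals, here is a minimal sketch of the same greedy rule as a function that only returns a keep/drop flag, applied per INDIVIDUAL with ave(). The names keep_spaced and min_gap are made up for illustration, the rows are assumed to be sorted by DATENUMBER within each individual, and the DATENUMBER values are copied from the example data in the thread.

```r
# Greedy filter: keep the first record, then every record that falls more
# than `min_gap` days after the most recently kept one.
# `keep_spaced` and `min_gap` are illustrative names, not from the thread.
keep_spaced <- function(day, min_gap = 10) {
  keep <- logical(length(day))
  last <- -Inf                      # so that the first record is always kept
  for (i in seq_along(day)) {
    if (day[i] - last > min_gap) {
      keep[i] <- TRUE
      last <- day[i]
    }
  }
  keep
}

# DATENUMBER values for the example individual, copied from the thread
bc.df <- data.frame(
  INDIVIDUAL = 57084544,
  DATENUMBER = c(133, 272, 274, 281, 327, 393, 433, 549, 553, 582, 728,
                 762, 789, 1056, 1185, 1209, 1238, 1239, 1293, 1317, 1335,
                 1672, 1673, 1674, 1684, 1757, 1798)
)

# ave() runs the function within each INDIVIDUAL and returns 0/1 values,
# hence the as.logical(). Rows must be sorted by DATENUMBER per individual.
keep <- as.logical(ave(bc.df$DATENUMBER, bc.df$INDIVIDUAL, FUN = keep_spaced))
filtered <- bc.df[keep, ]

# Recompute the interval between successive kept records
filtered$interval <- ave(filtered$DATENUMBER, filtered$INDIVIDUAL,
                         FUN = function(d) c(NA, diff(d)))
```

For the example individual this drops the records with DATENUMBER 274, 281, 553, 1239, 1673 and 1674 and keeps 1684, which is 12 days after the last kept record (1672).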
Best,
Kimmo
On Tue, 2022-04-05 at 15:36 +0300, Eric Berger wrote:
> For a different approach, use the Date column rather than the differences
> column. I assume the data has been put into the bc.df data frame (as Jim
> does, above).
>
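> f <- function(v, m = 10L) {
>   # v: strictly increasing integer dates; returns the indices to keep
>   w <- 1L                                   # always keep the first record
>   while ((i <- tail(w, 1)) < length(v)) {
>     # first date in the next m+1 positions that is more than m days later
>     nxt <- match(TRUE, v[i:min(i + m + 1L, length(v))] > v[i] + m) + (i - 1L)
>     if (is.na(nxt)) break                   # no later record is far enough away
>     w <- c(w, nxt)
>   }
>   w
> }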
> f(as.integer(as.Date(strptime(bc.df$Date,"%d-%b-%y"))))
>
>
>
> On Tue, Apr 5, 2022 at 1:16 AM Jim Lemon <drjimlemon using gmail.com>
> wrote:
>
> > Hi Brian,
> > Perhaps this:
> >
> > bc.df<-read.table(text="Date INDIVIDUAL DATENUMBER LENGTH length.prev interval
> > 12-May-04 57084544 133 682.4 NA NA
> > 28-Sep-04 57084544 272 724.8 682.4 139
> > 30-Sep-04 57084544 274 740.8 724.8 2
> > 7-Oct-04 57084544 281 745.4 740.8 7
> > 22-Nov-04 57084544 327 780.2 745.4 46
> > 27-Jan-05 57084544 393 817.2 780.2 66
> > 8-Mar-05 57084544 433 834.1 817.2 40
> > 2-Jul-05 57084544 549 876.3 834.1 116
> > 6-Jul-05 57084544 553 871.5 876.3 4
> > 4-Aug-05 57084544 582 887.5 871.5 29
> > 28-Dec-05 57084544 728 921.8 887.5 146
> > 31-Jan-06 57084544 762 936.8 921.8 34
> > 27-Feb-06 57084544 789 962.4 936.8 27
> > 21-Nov-06 57084544 1056 972.3 962.4 267
> > 30-Mar-07 57084544 1185 1007.2 972.3 129
> > 23-Apr-07 57084544 1209 1009.1 1007.2 24
> > 22-May-07 57084544 1238 991.6 1009.1 29
> > 23-May-07 57084544 1239 1015.9 991.6 1
> > 16-Jul-07 57084544 1293 1006.5 1015.9 54
> > 9-Aug-07 57084544 1317 1013.0 1006.5 24
> > 27-Aug-07 57084544 1335 1013.0 1013.0 18
> > 29-Jul-08 57084544 1672 1021.5 1013.0 337
> > 30-Jul-08 57084544 1673 984.3 1021.5 1
> > 31-Jul-08 57084544 1674 1008.5 984.3 1
> > 10-Aug-08 57084544 1684 1002.8 1008.5 10
> > 22-Oct-08 57084544 1757 977.6 1002.8 73
> > 2-Dec-08 57084544 1798 1000.6 977.6 41",
> > stringsAsFactors=FALSE,header=TRUE)
> > min_interval <- function(x, minint = 10) {
> >   indx <- 1            # always keep the first record
> >   cumint <- 0          # days accumulated since the last kept record
> >   for (i in seq_along(x)[-1]) {
> >     cumint <- cumint + x[i]
> >     if (cumint > minint) {
> >       indx <- c(indx, i)
> >       cumint <- 0
> >     }
> >   }
> >   return(indx)
> > }
> > min_interval(bc.df$interval)
> >
> > Jim
> >
> > On Tue, Apr 5, 2022 at 7:31 AM Ebert,Timothy Aaron <tebert using ufl.edu>
> > wrote:
> > > I think the idea is more like this (pseudocode):
> > > for (i in 2:nrow(x)) {
> > >   if (x[i] - x[i-1] > 10) keep x[i] else delete x[i]
> > > }
> > >
> > > I am not quite clear on the correct code for "keep" or "delete."
> > >
> > > One could try
> > > library(dplyr)
> > > x$new <- c(NA, diff(x$DATENUMBER))   # days since the previous record
> > > x <- x %>% filter(is.na(new) | new >= 10)   # is.na() keeps the first record
> > >
> > > This only works if consecutive sample dates are 10 or more days apart.
> > > You could add an else-if that would accumulate days and, if successful,
> > > reset the clock.
> > > Tim
> > > -----Original Message-----
> > > From: R-help <r-help-bounces using r-project.org> On Behalf Of Bert Gunter
> > > Sent: Monday, April 4, 2022 5:04 PM
> > > To: Cade, Brian S <cadeb using usgs.gov>
> > > Cc: r-help using r-project.org
> > > Subject: Re: [R] complicated time series filtering issue
> > >
> > > Like this?
> > >
> > > winnow <- function(x, int = 5) {
> > >   keep <- x[1]                 # always keep the first value
> > >   remaining <- x[-1]
> > >   repeat {
> > >     # candidates more than `int` beyond the last kept value
> > >     remaining <- remaining[remaining > tail(keep, 1) + int]
> > >     if (!length(remaining)) break
> > >     keep <- c(keep, remaining[1])
> > >   }
> > >   keep
> > > }
> > >
> > > > x
> > > [1] 1 2 5 7 8 9 15 16 17 19 20 21 28 35 37 41 43 45 46 50
> > > > winnow(x,7)
> > > [1] 1 9 17 28 37 45
> > > > winnow(x,5)
> > > [1] 1 7 15 21 28 35 41 50
> > >
> > > Cheers,
> > > Bert
> > >
> > > "The trouble with having an open mind is that people keep coming along
> > > and sticking things into it."
> > > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
> > >
> > > On Mon, Apr 4, 2022 at 12:56 PM Cade, Brian S via R-help <r-help using r-project.org> wrote:
> > > > Hello: I have an issue with filtering a time series of animal growth
> > > > data that seems conceptually simple, but I have not come up with
> > > > effective code to implement it. I have temporal sequences of lengths
> > > > by individual, and I want to retain only those data that are more than
> > > > 10 days apart sequentially within an individual's records. I can
> > > > readily compute intervals between successive dates by individual using
> > > > data.table() and its by = INDIVIDUAL functionality. See the example
> > > > data for one individual below. What currently eludes me is how to
> > > > recognize, for example, that deleting the 2nd and 3rd rows is required
> > > > because the totality of their time interval is 9 days, deleting the
> > > > 8th record with 4 days is required, deleting the 17th record with 1
> > > > day is required, and deleting the 22nd and 23rd records is required
> > > > because their sum is 2 days, but we do not delete the 24th record of
> > > > 10 days because the sum of the previous 2 deleted records and this one
> > > > is now 12 days. Each individual can have very different patterns of
> > > > these sorts of sequences. These sequences are easy to look at and
> > > > determine what needs to be done, but writing effective code to
> > > > accomplish this filtering seems to require some functionality that I
> > > > am currently missing.
> > > > Any suggestions would be greatly appreciated.
> > > >
> > > >     Date      INDIVIDUAL DATENUMBER LENGTH length.prev interval
> > > > 228 12-May-04 57084544          133  682.4          NA       NA
> > > > 229 28-Sep-04 57084544          272  724.8       682.4      139
> > > > 230 30-Sep-04 57084544          274  740.8       724.8        2
> > > > 231 7-Oct-04  57084544          281  745.4       740.8        7
> > > > 232 22-Nov-04 57084544          327  780.2       745.4       46
> > > > 233 27-Jan-05 57084544          393  817.2       780.2       66
> > > > 234 8-Mar-05  57084544          433  834.1       817.2       40
> > > > 235 2-Jul-05  57084544          549  876.3       834.1      116
> > > > 236 6-Jul-05  57084544          553  871.5       876.3        4
> > > > 237 4-Aug-05  57084544          582  887.5       871.5       29
> > > > 238 28-Dec-05 57084544          728  921.8       887.5      146
> > > > 239 31-Jan-06 57084544          762  936.8       921.8       34
> > > > 240 27-Feb-06 57084544          789  962.4       936.8       27
> > > > 241 21-Nov-06 57084544         1056  972.3       962.4      267
> > > > 242 30-Mar-07 57084544         1185 1007.2       972.3      129
> > > > 243 23-Apr-07 57084544         1209 1009.1      1007.2       24
> > > > 244 22-May-07 57084544         1238  991.6      1009.1       29
> > > > 245 23-May-07 57084544         1239 1015.9       991.6        1
> > > > 246 16-Jul-07 57084544         1293 1006.5      1015.9       54
> > > > 247 9-Aug-07  57084544         1317 1013.0      1006.5       24
> > > > 248 27-Aug-07 57084544         1335 1013.0      1013.0       18
> > > > 249 29-Jul-08 57084544         1672 1021.5      1013.0      337
> > > > 250 30-Jul-08 57084544         1673  984.3      1021.5        1
> > > > 251 31-Jul-08 57084544         1674 1008.5       984.3        1
> > > > 252 10-Aug-08 57084544         1684 1002.8      1008.5       10
> > > > 253 22-Oct-08 57084544         1757  977.6      1002.8       73
> > > > 254 2-Dec-08  57084544         1798 1000.6       977.6       41
> > > >
> > > >
> > > >
> > > > Brian
> > > >
> > > >
> > > >
> > > > Brian S. Cade, PhD
> > > >
> > > > U. S. Geological Survey
> > > > Fort Collins Science Center
> > > > 2150 Centre Ave., Bldg. C
> > > > Fort Collins, CO 80526-8818
> > > >
> > > > email: cadeb using usgs.gov
> > > > tel: 970 226-9326
> > > > ______________________________________________
> > > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide
> > > > http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.