[R] complicated time series filtering issue

Eric Berger er|cjberger @end|ng |rom gm@||@com
Tue Apr 5 14:36:37 CEST 2022


For a different approach, work from the Date column rather than the
interval (differences) column.
I assume the data have been read into the bc.df data frame, as in Jim's
code quoted below.

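f <- function(v, m=10L) {
  w <- 1L                                 # always keep the first record
  while( (i <- tail(w,1)) < length(v)) {
    # first later record more than m days after the last kept one
    j <- match(TRUE, v[(i+1):length(v)] > v[i]+m)
    if (is.na(j)) break                   # nothing left that is far enough away
    w <- c(w, i+j)
  }
  w                                       # indices of the rows to keep
}
# %b month names depend on the locale; the dates here are English
f(as.integer(as.Date(bc.df$Date, format="%d-%b-%y")))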
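
To run the same rule separately for each animal, one option is a helper that
returns a keep/drop flag per record, applied within each INDIVIDUAL by ave().
This is only a sketch under assumptions: keep_spaced is a made-up name (it is
not from this thread), the rows are assumed to be sorted by date within each
individual, and the data frame below just retypes the DATENUMBER column from
Brian's post.

```r
# Hypothetical helper (not from the thread): flag each record that falls more
# than m days after the most recently kept record. Assumes `days` is sorted.
keep_spaced <- function(days, m = 10) {
  keep <- logical(length(days))
  last <- -Inf                      # guarantees the first record is kept
  for (i in seq_along(days)) {
    if (days[i] - last > m) {
      keep[i] <- TRUE
      last <- days[i]               # restart the clock at the kept record
    }
  }
  keep
}

# DATENUMBER values for the one individual shown in Brian's post
dat <- data.frame(
  INDIVIDUAL = 57084544,
  DATENUMBER = c(133, 272, 274, 281, 327, 393, 433, 549, 553, 582, 728, 762,
                 789, 1056, 1185, 1209, 1238, 1239, 1293, 1317, 1335, 1672,
                 1673, 1674, 1684, 1757, 1798)
)

# ave() runs the helper within each INDIVIDUAL. Because its first argument is
# numeric, the flags come back as 0/1, so convert them to logical.
flag <- as.logical(ave(dat$DATENUMBER, dat$INDIVIDUAL, FUN = keep_spaced))

which(!flag)   # rows that would be dropped
dat[flag, ]    # rows that would be retained
```

On these values the dropped rows are 3, 4, 9, 18, 23 and 24, which are the
records Brian describes with intervals of 2, 7, 4, 1, 1 and 1 days.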



On Tue, Apr 5, 2022 at 1:16 AM Jim Lemon <drjimlemon using gmail.com> wrote:

> Hi Brian,
> Perhaps this:
>
> bc.df<-read.table(text="Date   INDIVIDUAL DATENUMBER LENGTH length.prev interval
> 12-May-04 57084544        133         682.4           NA       NA
> 28-Sep-04 57084544        272         724.8        682.4      139
> 30-Sep-04 57084544        274         740.8        724.8        2
> 7-Oct-04 57084544        281         745.4        740.8        7
> 22-Nov-04 57084544        327         780.2        745.4       46
> 27-Jan-05 57084544        393         817.2        780.2       66
> 8-Mar-05 57084544        433         834.1        817.2       40
> 2-Jul-05 57084544        549         876.3        834.1      116
> 6-Jul-05 57084544        553         871.5        876.3        4
> 4-Aug-05 57084544        582         887.5        871.5       29
> 28-Dec-05 57084544        728         921.8        887.5      146
> 31-Jan-06 57084544        762         936.8        921.8       34
> 27-Feb-06 57084544        789         962.4        936.8       27
> 21-Nov-06 57084544       1056         972.3        962.4      267
> 30-Mar-07 57084544       1185        1007.2        972.3      129
> 23-Apr-07 57084544       1209        1009.1       1007.2       24
> 22-May-07 57084544       1238         991.6       1009.1       29
> 23-May-07 57084544       1239        1015.9        991.6        1
> 16-Jul-07 57084544       1293        1006.5       1015.9       54
> 9-Aug-07 57084544       1317        1013.0       1006.5       24
> 27-Aug-07 57084544       1335        1013.0       1013.0       18
> 29-Jul-08 57084544       1672        1021.5       1013.0      337
> 30-Jul-08 57084544       1673         984.3       1021.5        1
> 31-Jul-08 57084544       1674        1008.5        984.3        1
> 10-Aug-08 57084544       1684        1002.8       1008.5       10
> 22-Oct-08 57084544       1757         977.6       1002.8       73
> 2-Dec-08 57084544       1798        1000.6        977.6       41",
> stringsAsFactors=FALSE,header=TRUE)
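> min_interval<-function(x,minint=10) {
>  # x: days since the previous record (x[1] is NA); returns rows to keep
>  indx<-1
>  cumint<-0
>  for(i in seq_along(x)[-1]) {
>   # accumulate days since the last kept record
>   cumint<-cumint+x[i]
>   if(cumint > minint) {
>    indx<-c(indx,i)
>    cumint<-0
>   }
>  }
>  return(indx)
> }
> min_interval(bc.df$interval)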
>
> Jim
>
> On Tue, Apr 5, 2022 at 7:31 AM Ebert,Timothy Aaron <tebert using ufl.edu> wrote:
> >
> > I think the idea is more
> > for (i in 2:nrow(x)){
> > ifelse(x[i]-x[i-1] > 10, keep x[i], delete x[i])
> > }
> >
> > I am not quite clear on the correct code for "keep" or "delete."
> >
> > One could try
> > library(dplyr)
> > x$new <- NA
> > for (i in 2:nrow(x)){
> > x$new[i] <- x$DATENUMBER[i]-x$DATENUMBER[i-1]
> > }
> > x <- x %>% filter(is.na(new) | new>=10)
> >
> > This only works if consecutive sample dates are 10 or more days apart.
> > You could add an else if that would accumulate days, and if successful
> > reset the clock.
> >
> > Tim
> > -----Original Message-----
> > From: R-help <r-help-bounces using r-project.org> On Behalf Of Bert Gunter
> > Sent: Monday, April 4, 2022 5:04 PM
> > To: Cade, Brian S <cadeb using usgs.gov>
> > Cc: r-help using r-project.org
> > Subject: Re: [R] complicated time series filtering issue
> >
> > Like this?
> >
> > winnow <- function(x, int=5){
> >    keep <- x[1]
> >    remaining <- x[-1]
> >    while (length(remaining))
> >    {
> >       # drop everything within int of the last kept value
> >       remaining <- remaining[remaining > tail(keep,1) + int]
> >       if(length(remaining) == 0) break
> >       keep <- c(keep,remaining[1])
> >    }
> >    keep
> > }
> >
> > > x
> >  [1]  1  2  5  7  8  9 15 16 17 19 20 21 28 35 37 41 43 45 46 50
> > > winnow(x,7)
> > [1]  1  9 17 28 37 45
> > > winnow(x,5)
> > [1]  1  7 15 21 28 35 41 50
> >
> > Cheers,
> > Bert
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
> >
> > On Mon, Apr 4, 2022 at 12:56 PM Cade, Brian S via R-help <r-help using r-project.org> wrote:
> > >
> > > Hello:  I have an issue with filtering a time series of animal
> > > growth data that seems conceptually simple, but I have not come up
> > > with effective code to implement it.  I have temporal sequences of
> > > lengths by individual, and I want to retain only those data that
> > > are more than 10 days apart sequentially within an individual's
> > > records.  I can readily compute intervals between successive dates
> > > by individual using data.table() and its by = INDIVIDUAL
> > > functionality.  See example data for one individual below.  But what
> > > currently eludes me is how to recognize, for example, that deleting
> > > the 2nd and 3rd rows is required because the totality of their time
> > > interval is 9 days, deleting the 8th record with 4 days is required,
> > > deleting the 17th record with 1 day is required, and deleting the
> > > 22nd and 23rd records is required because their sum is 2 days, but
> > > we do not delete the 24th record of 10 days because the sum of the
> > > previous 2 deleted records and this one is now 12 days.  Each
> > > individual can have very different patterns of these sorts of
> > > sequences.  These sequences are easy to look at and determine what
> > > needs to be done, but writing effective code to accomplish this
> > > filtering seems to require some functionality that I am currently
> > > missing.
> > >
> > > Any suggestions would be greatly appreciated.
> > >
> > >          Date   INDIVIDUAL DATENUMBER LENGTH length.prev interval
> > > 228 12-May-04 57084544        133         682.4           NA       NA
> > > 229 28-Sep-04 57084544        272         724.8        682.4      139
> > > 230 30-Sep-04 57084544        274         740.8        724.8        2
> > > 231  7-Oct-04 57084544        281         745.4        740.8        7
> > > 232 22-Nov-04 57084544        327         780.2        745.4       46
> > > 233 27-Jan-05 57084544        393         817.2        780.2       66
> > > 234  8-Mar-05 57084544        433         834.1        817.2       40
> > > 235  2-Jul-05 57084544        549         876.3        834.1      116
> > > 236  6-Jul-05 57084544        553         871.5        876.3        4
> > > 237  4-Aug-05 57084544        582         887.5        871.5       29
> > > 238 28-Dec-05 57084544        728         921.8        887.5      146
> > > 239 31-Jan-06 57084544        762         936.8        921.8       34
> > > 240 27-Feb-06 57084544        789         962.4        936.8       27
> > > 241 21-Nov-06 57084544       1056         972.3        962.4      267
> > > 242 30-Mar-07 57084544       1185        1007.2        972.3      129
> > > 243 23-Apr-07 57084544       1209        1009.1       1007.2       24
> > > 244 22-May-07 57084544       1238         991.6       1009.1       29
> > > 245 23-May-07 57084544       1239        1015.9        991.6        1
> > > 246 16-Jul-07 57084544       1293        1006.5       1015.9       54
> > > 247  9-Aug-07 57084544       1317        1013.0       1006.5       24
> > > 248 27-Aug-07 57084544       1335        1013.0       1013.0       18
> > > 249 29-Jul-08 57084544       1672        1021.5       1013.0      337
> > > 250 30-Jul-08 57084544       1673         984.3       1021.5        1
> > > 251 31-Jul-08 57084544       1674        1008.5        984.3        1
> > > 252 10-Aug-08 57084544       1684        1002.8       1008.5       10
> > > 253 22-Oct-08 57084544       1757         977.6       1002.8       73
> > > 254  2-Dec-08 57084544       1798        1000.6        977.6       41
> > >
> > >
> > >
> > > Brian
> > >
> > >
> > >
> > > Brian S. Cade, PhD
> > >
> > > U. S. Geological Survey
> > > Fort Collins Science Center
> > > 2150 Centre Ave., Bldg. C
> > > Fort Collins, CO  80526-8818
> > >
> > > email:  cadeb using usgs.gov
> > > tel:  970 226-9326
> > >
> > >
> > > ______________________________________________
> > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
