[R] flag records
Rui Barradas
ru|pb@rr@d@@ @end|ng |rom @@po@pt
Wed Apr 27 07:53:04 CEST 2022
Hello,
Maybe something like the following will do it.
In the ave function, don't forget that diff returns a vector of a
different length, one less element. So combine with an initial zero.
Then 1 + FALSE/TRUE equals 1/2 and subset the target vector c("Y", "X")
with these indices.
i_ddiff <- with(DF3, ave(as.numeric(ddate), State, name, day, FUN = \(x)
c(0L, diff(x))) < 50)
DF3$ddiff <- c("Y", "X")[1L + i_ddiff]
An alternative is to assign a default "Y" to the new column and then
assign "X" where the condition is TRUE. This is easier to read.
DF3$ddiff <- "Y"
DF3$ddiff[i_ddiff] <- "X"
Hope this helps,
Rui Barradas
Às 23:17 de 26/04/2022, Val escreveu:
> Hi All,
>
> I want to flag a record based on the following condition.
> The variables in the sample data are
> State, name, day, text, ddate
>
> Sort the data by State, name, day ddate,
>
> Within State, name, day
> assign consecutive number for each row
> find the date difference between consecutive rows,
> if the difference is less than 50 days and the text string in
> previous and current rows are the same then flag the record as X,
> otherwise Y.
>
> Here is sample data and my attempt,
>
> DF<-read.table(text="State name day text ddate
> CA A 1 xch 2014/09/16
> CA A 2 xck 2015/5/29
> CA A 2 xck 2015/6/18
> CA A 2 xcm 2015/8/3
> CA A 2 xcj 2015/8/26
> FL B 3 xcu 2017/7/23
> FL B 3 xcl 2017/7/03
> FL B 3 xmc 2017/7/26
> FL B 3 xca 2017/3/17
> FL B 3 xcb 2017/4/8
> FL B 4 xhh 2017/3/17
> FL B 4 xhh 2017/1/29",header=TRUE)
>
> DF$ddate <- as.Date (as.Date(DF$ddate), format="%Y/%m/%d" )
> DF3 <- DF[order(DF$State,DF$name,DF$day,xtfrm(DF$ddate)), ]
> DF3$C <- with(DF3, ave(State, name, day, FUN = seq_along))
> DF3$diff <- with(DF3, ave(as.integer(ddate), State, name, day,
> FUN = function(x) x - x[1]))
>
> I stopped here, how do I evaluate the previous and the current rows
> text string and date difference?
>
> Desired result,
>
>
> State name day text ddate C diff flag
> 1 CA A 1 xch 2014-09-16 1 0 y
> 2 CA A 2 xck 2015-05-29 1 0 y
> 3 CA A 2 xck 2015-06-18 2 20 x
> 4 CA A 2 xcm 2015-08-03 3 66 y
> 5 CA A 2 xcj 2015-08-26 4 89 y
> 9 FL B 3 xca 2017-03-17 1 0 y
> 10 FL B 3 xcb 2017-04-08 2 22 y
> 7 FL B 3 xcl 2017-07-03 3 108 y
> 6 FL B 3 xcu 2017-07-23 4 128 y
> 8 FL B 3 xmc 2017-07-26 5 131 y
> 12 FL B 4 xhh 2017-01-29 1 0 y
> 11 FL B 4 xhh 2017-03-17 2 47 x
>
>
>
> Thank you,
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list