[R] Counting consecutive events in R
Sarah Goslee
sarah.goslee at gmail.com
Thu May 14 16:25:23 CEST 2015
Assuming I understand the problem correctly, you want to check for
runs of at least five consecutive rows where Score > 0 and Type_Desc
takes a particular value (0 or 1). You don't care where they are or
what other data are associated; you just want to know whether at least
one such run exists in your data frame.
Here's a function that does that:
checkruns <- function(testdata) {
  # Flag rows with Score > 0 and a non-missing Type_Desc of 1 (high) or 0 (low)
  test1 <- ifelse(testdata$Score > 0 & testdata$Type_Desc == 1 &
                  !is.na(testdata$Type_Desc), 1, 0)
  test0 <- ifelse(testdata$Score > 0 & testdata$Type_Desc == 0 &
                  !is.na(testdata$Type_Desc), 1, 0)

  # Run-length encode the flags: one length and one value per run
  test1.rle <- rle(test1)
  test0.rle <- rle(test0)

  # Report if any run of flagged rows is at least five long
  if(any(test1.rle$lengths >= 5 & test1.rle$values == 1))
    cat("Type_high\n")
  if(any(test0.rle$lengths >= 5 & test0.rle$values == 1))
    cat("Type_low\n")
  invisible()
}
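
The function above looks at the whole data frame at once. If the check
is wanted separately for each value of Type, with labels like
"PP_high", one possible sketch follows. The names has_run and
check_by_type and the toy data are made up for illustration, and the
NA handling is my own assumption rather than anything stated in the
original question.

```r
# Hypothetical helper: TRUE if `flag` contains at least `n` consecutive TRUEs.
has_run <- function(flag, n = 5) {
  r <- rle(flag)
  any(r$values & r$lengths >= n)
}

# Hypothetical wrapper: returns labels such as "PP_high" for every Type
# whose rows contain a qualifying run. Rows with a missing Score or
# Type_Desc are assumed to break a run.
check_by_type <- function(testdata, n = 5) {
  out <- character(0)
  for (type in unique(as.character(testdata$Type))) {
    d <- testdata[testdata$Type == type, ]
    ok <- !is.na(d$Score) & d$Score > 0 & !is.na(d$Type_Desc)
    if (has_run(ok & d$Type_Desc == 1, n)) out <- c(out, paste0(type, "_high"))
    if (has_run(ok & d$Type_Desc == 0, n)) out <- c(out, paste0(type, "_low"))
  }
  out
}

# Toy data (not the poster's): "A" has five qualifying rows in a row,
# "B" has runs of only two and three.
toy <- data.frame(
  Type      = rep(c("A", "B"), each = 6),
  Score     = c(0, 2, 2, 2, 2, 2,   3, 3, 0, 3, 3, 3),
  Type_Desc = c(NA, 1, 1, 1, 1, 1,  0, 0, NA, 0, 0, 0)
)
check_by_type(toy)
# [1] "A_high"
```

Returning the labels instead of printing them with cat() makes the
result easier to reuse; wrap the call in print() or cat() if you only
want the messages.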
Sarah
On Thu, May 14, 2015 at 8:16 AM, Abhinaba Roy <abhinabaroy09 at gmail.com> wrote:
> Hi,
>
> I have the following dataframe
>
> structure(list(Type = c("QRS", "QRS", "QRS", "QRS", "QRS", "QRS",
> "QRS", "QRS", "QRS", "QRS", "QRS", "QRS", "RR", "RR", "RR", "PP",
> "PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "QTc", "QTc",
> "QTc", "QTc", "QTc", "QTc", "QTc", "QTc", "QTc", "QTc", "QTc",
> "QTc", "QTc", "QTc", "QTc"), Time_Point_Start = c("2015-04-01 14:57:15.0.0312",
> "2015-04-01 14:57:15.0.7839", "2015-04-01 14:57:16.0.5343",
> "2015-04-01 14:57:17.0.2573",
> "2015-04-01 14:57:18.0.0234", "2015-04-01 14:57:18.0.7722",
> "2015-04-01 14:57:19.0.5265",
> "2015-04-01 14:57:24.0.0195", "2015-04-01 14:57:24.0.7839",
> "2015-04-01 14:57:25.0.5343",
> "2015-04-01 14:57:26.0.2768", "2015-04-01 14:57:27.0.0273",
> "2015-04-01 14:58:03.0.0702",
> "2015-04-01 14:58:03.0.8190", "2015-04-01 14:58:04.0.5694",
> "2015-04-01 14:57:58.0.4134",
> "2015-04-01 14:57:59.0.1637", "2015-04-01 14:57:59.0.9126",
> "2015-04-01 14:58:00.0.6630",
> "2015-04-01 14:58:01.0.4134", "2015-04-01 14:58:02.0.1637",
> "2015-04-01 14:58:02.0.9126",
> "2015-04-01 14:58:03.0.6630", "2015-04-01 14:58:04.0.4134",
> "2015-04-01 14:57:07.0.4212",
> "2015-04-01 14:57:08.0.1715", "2015-04-01 14:57:08.0.9204",
> "2015-04-01 14:57:09.0.6864",
> "2015-04-01 14:57:10.0.4368", "2015-04-01 14:57:11.0.1871",
> "2015-04-01 14:57:11.0.9360",
> "2015-04-01 14:57:12.0.6591", "2015-04-01 14:57:13.0.4251",
> "2015-04-01 14:57:14.0.1754",
> "2015-04-01 14:57:14.0.9243", "2015-04-01 14:57:15.0.6903",
> "2015-04-01 14:57:16.0.4407",
> "2015-04-01 14:57:17.0.1676", "2015-04-01 14:57:17.0.9321"),
> Time_Point_End = c("2015-04-01 14:57:15.0.0858",
> "2015-04-01 14:57:15.0.8346",
> "2015-04-01 14:57:16.0.6006", "2015-04-01 14:57:17.0.0351",
> "2015-04-01 14:57:18.0.1403", "2015-04-01 14:57:18.0.8385",
> "2015-04-01 14:57:19.0.5889", "2015-04-01 14:57:24.0.0858",
> "2015-04-01 14:57:24.0.8346", "2015-04-01 14:57:25.0.5772",
> "2015-04-01 14:57:26.0.3939", "2015-04-01 14:57:27.0.0936",
> "2015-04-01 14:58:03.0.8190", "2015-04-01 14:58:04.0.5694",
> "2015-04-01 14:58:05.0.3197", "2015-04-01 14:57:59.0.1637",
> "2015-04-01 14:57:59.0.9126", "2015-04-01 14:58:00.0.6630",
> "2015-04-01 14:58:01.0.4134", "2015-04-01 14:58:02.0.1637",
> "2015-04-01 14:58:02.0.9126", "2015-04-01 14:58:03.0.6630",
> "2015-04-01 14:58:04.0.4134", "2015-04-01 14:58:05.0.1793",
> "2015-04-01 14:57:07.0.8775", "2015-04-01 14:57:08.0.6435",
> "2015-04-01 14:57:09.0.3705", "2015-04-01 14:57:10.0.1209",
> "2015-04-01 14:57:10.0.8697", "2015-04-01 14:57:11.0.6201",
> "2015-04-01 14:57:12.0.3861", "2015-04-01 14:57:13.0.1364",
> "2015-04-01 14:57:13.0.8853", "2015-04-01 14:57:14.0.6513",
> "2015-04-01 14:57:15.0.4017", "2015-04-01 14:57:16.0.1248",
> "2015-04-01 14:57:16.0.9165", "2015-04-01 14:57:17.0.6162",
> "2015-04-01 14:57:18.0.3900"), Value = c(0.0546, 0.0507,
> 0.0663, 0.0936, 0.117, 0.0663, 0.0624, 0.0663, 0.0507, 0.0429,
> 0.117, 0.0663, 0.7488, 0.7488, 0.7488, 0.7488, 0.7488, 0.7488,
> 0.7488, 0.7488, 0.7488, 0.7488, 0.7488, 0.7644, 0.033103481,
> 0.034056449, 0.032367699, 0.031000613, 0.031405867, 0.031241866,
> 0.032367699, 0.034337907, 0.033125921, 0.034337907, 0.034337907,
> 0.031241866, 0.034337907, 0.032367699, 0.032930616), Score = c(0L,
> 0L, 0L, 0L, 3L, 0L, 0L, 0L, 0L, 0L, 3L, 0L, 0L, 0L, 0L, 0L,
> 0L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), Type_Desc = c(NA, NA, NA,
> NA, 1L, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA, NA, 1L,
> 1L, 1L, 1L, 1L, NA, NA, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L), Pat_id = c(4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L)), .Names = c("Type", "Time_Point_Start", "Time_Point_End",
> "Value", "Score", "Type_Desc", "Pat_id"), class = "data.frame",
> row.names = c(NA,
> -39L))
>
>
> For each unique value in column 'Type', I want to check for
> 5 consecutive rows (if any) with 'Score' > 0.
>
> Now, if there are five consecutive rows with Score > 0 and 'Type_Desc'
> = 0, then we print "Type_low"; else if 'Type_Desc' = 1, we print
> "Type_high". The search should end once 5 consecutive rows have been
> found.
>
> So, for this data frame we will have two statements as follows,
>
>
> 1. PP_high
>    (reason - consecutive 5 rows of score > 0 and 'Type_Desc' = 1)
>
> 2. QTc_low
>    (reason - consecutive 5 rows of score > 0 and 'Type_Desc' = 0)
>
> How can this problem be tackled in R?
>
> Thanks,
>
> Abhinaba
--
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org