[R] R function to convert person-level observations to person-period observations
David Barron
dnbarron at gmail.com
Sat Jan 3 16:18:49 CET 2015
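Your data are coded the wrong way round for this function. As written,
the function expects the 'event' variable (dead in your example) to be
a censoring indicator: 1 for spells that are censored and 0 for cases
that end in an event. Your 'dead' variable is coded 1 for a death and 0
for a censored spell, and the line reve <- !data[, event] flips it. If
you change the 'dead' variable to c(0,1,0) you will get the desired
result.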
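To see what the function does with the event column, here is a minimal
sketch that traces only the relevant lines of the 'period' branch on
your data. The names censored, last_row_as_posted and last_row_recoded
are made up for illustration and are not part of PLPP.

```r
## Person-level data from the original post: 1 = died, 0 = censored
df <- data.frame(ID = LETTERS[1:3], dead = c(1, 0, 1), studyyrs = c(2, 5, 3))

## PLPP negates the event column and writes the result into the
## last row of each person's block of person-period rows
idmax <- cumsum(df$studyyrs)                  # last row per person
last_row_as_posted <- as.integer(!df$dead)    # values written for A, B, C

## Illustrative recoding as a censoring indicator: 1 = censored, 0 = event
df$censored <- 1 - df$dead
last_row_recoded <- as.integer(!df$censored)  # values written after recoding

idmax
last_row_as_posted
last_row_recoded
```

As posted, the 1 lands in B's final row; after recoding it lands in the
final rows of A and C.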
If you would rather keep your coding and reverse the behaviour of the function instead, change the line
reve <- !data[, event]
to
reve <- data[, event]
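
For illustration, here is a minimal sketch of the person-level to
person-period direction with that change applied, so that the event
variable is read as 1 = event, 0 = censored. The name to_person_period
is hypothetical; this is a cut-down rewrite, not the UCLA function, and
it assumes one row per person with no missing values in the id, period,
or event columns.

```r
## Hypothetical helper (not the UCLA PLPP function): expands person-level
## rows to person-period rows, assuming event is coded 1 = event, 0 = censored
to_person_period <- function(data, id, period, event) {
  stopifnot(is.data.frame(data))
  stopifnot(!any(is.na(data[, c(id, period, event)])))

  index <- rep(seq_len(nrow(data)), data[[period]])  # repeat each row 'period' times
  idmax <- cumsum(data[[period]])                    # last row of each person's block
  dat <- data[index, ]
  dat[[period]] <- ave(dat[[period]], dat[[id]], FUN = seq_along)  # 1, 2, ... per person
  dat[[event]] <- 0                                  # no event in earlier periods
  dat[idmax, event] <- data[[event]]                 # original value in the final period
  rownames(dat) <- NULL
  dat
}

df <- data.frame(ID = LETTERS[1:3], dead = c(1, 0, 1), studyyrs = c(2, 5, 3))
tpp <- to_person_period(df, id = "ID", period = "studyyrs", event = "dead")
tpp
```

With dead = c(1,0,1) this gives dead = 1 only in the last row for A and
C, and 0 in every row for B.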
David
On 3 January 2015 at 13:20, Muhuri, Pradip (SAMHSA/CBHSQ)
<Pradip.Muhuri at samhsa.hhs.gov> wrote:
> Hello,
>
> I was trying to convert person-level observations to person-period observations using an R custom function obtained from the UCLA web site (http://www.ats.ucla.edu/stat/r/faq/person_period.htm). Please see my reproducible example below. The function (PLPP) in the R script takes five arguments.
>
>
> 1) data (i.e., the data set to be converted)
>
> 2) id (i.e., the identifier for each observation)
>
> 3) period (i.e., the number of periods the person or observation was followed up)
>
> 4) event (i.e., the variable that indicates whether the event occurred or whether the observation was censored, depending on which direction you are converting)
>
> 5) direction, which "indicates whether the function should go from person-level to person-period or from person-period to person-level".
>
> On my example data set, the R script ran successfully. Based on 3 person-level observations (A died in year 2, B is censored in year 5, C died in year 3), I get 10 person-period observations, which is the correct number. But the issue is that the value of the "dead" indicator variable is incorrect. I have a gut feeling that the function needs to be tweaked a bit to get the desired results.
>
>
> Person-level data (input)
>   ID dead studyyrs
> 1  A    1        2
> 2  B    0        5
> 3  C    1        3
>
> Incorrect results - the "dead" column
>
>    ID dead studyyrs
> 1   A    0        1
> 2   A    0        2
> 3   B    0        1
> 4   B    0        2
> 5   B    0        3
> 6   B    0        4
> 7   B    1        5
> 8   C    0        1
> 9   C    0        2
> 10  C    0        3
>
> Desired results
>
>    ID dead studyyrs
> 1   A    0        1
> 2   A    1        2
> 3   B    0        1
> 4   B    0        2
> 5   B    0        3
> 6   B    0        4
> 7   B    0        5
> 8   C    0        1
> 9   C    0        2
> 10  C    1        3
>
> I would appreciate receiving your help or hints for resolving the issue. Thanks,
>
>
>
> ## My reproducible code is shown below
>
> ## Below is my data frame (3 observations)
> df <- data.frame( ID=LETTERS[1:3], dead=c(1,0,1), studyyrs=c(2,5,3) )
> df
>
> ## Person-Level Person-Period Converter Function - Source: http://www.ats.ucla.edu/stat/r/faq/person_period.htm
> PLPP <- function(data, id, period, event, direction = c("period", "level")) {
>   ## Data Checking and Verification Steps
>   stopifnot(is.matrix(data) || is.data.frame(data))
>   stopifnot(c(id, period, event) %in% c(colnames(data), 1:ncol(data)))
>
>   if (any(is.na(data[, c(id, period, event)]))) {
>     stop("PLPP cannot currently handle missing data in the id, period, or event variables")
>   }
>
>   ## Do the conversion - Source: http://www.ats.ucla.edu/stat/r/faq/person_period.htm
>   switch(match.arg(direction),
>     period = {
>       index <- rep(1:nrow(data), data[, period])
>       idmax <- cumsum(data[, period])
>       reve <- !data[, event]
>       dat <- data[index, ]
>       dat[, period] <- ave(dat[, period], dat[, id], FUN = seq_along)
>       dat[, event] <- 0
>       dat[idmax, event] <- reve},
>     level = {
>       tmp <- cbind(data[, c(period, id)], i = 1:nrow(data))
>       index <- as.vector(by(tmp, tmp[, id],
>         FUN = function(x) x[which.max(x[, period]), "i"]))
>       dat <- data[index, ]
>       dat[, event] <- as.integer(!dat[, event])
>     })
>
>   rownames(dat) <- NULL
>   return(dat)
> }
>
> tpp <- PLPP(data = df, id = "ID", period = "studyyrs",
>             event = "dead", direction = "period")
> tpp
>
>
>
> Pradip K. Muhuri,
> SAMHSA/CBHSQ