[R] data recoding problem

Williams Scott Scott.Williams at petermac.org
Mon Apr 23 10:14:20 CEST 2007

Hi R experts,

I have a data recoding problem I can't get my head around - I am not that
great at the subsetting syntax. I have a dataset of longitudinal
toxicity data (for multistate modelling) for which I also want to
produce a simple Kaplan-Meier curve of the time to first toxic event.

The data for 2 cases presently looks like this (one with an event, the
other without), with id identifying each person on study, t the
follow-up time, and event the status at that time:

> tox
 id      t       event
 PMC011  0.000     0
 PMC011  3.154     0
 PMC011  5.914     0
 PMC011 12.353     0
 PMC011 18.103     1
 PMC011 24.312     0
 PMC011 30.029     0
 PMC011 47.967     0
 PMC011 96.953     0
 PMC016  0.000     0
 PMC016  3.943     0
 PMC016  5.782     0
 PMC016 11.762     0
 PMC016 17.741     0
 PMC016 23.951     0
 PMC016 28.353     0
 PMC016 44.747     0
 PMC016 89.692     0 

So what I need is an output in the same column format, with one row for
each unique value of id:

PMC011 18.103     1
PMC016 89.692     0

In my head, I would do this by taking each unique value of id (each
unique case) and looking down the event data for that case. If there
is no event (all event==0), I would go to the time column (t), find
the max value, and paste this time along with a 0 for event. If there
were an event, I would instead find the minimum time associated
with an event and paste that across with the event marker. I am sure
someone out there can point me in the right direction to do this without
tedious and slow loops. Any help greatly appreciated.
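
For concreteness, the steps described above could be sketched in base R
roughly as follows. This is only one possible approach under stated
assumptions, not a definitive solution: it assumes the data frame is
called tox with columns id, t and event as shown, that event is coded
0/1, and first_event is a hypothetical helper name made up for this
illustration.

```r
# Example data copied from the two cases shown above.
tox <- data.frame(
  id = rep(c("PMC011", "PMC016"), each = 9),
  t = c(0.000, 3.154, 5.914, 12.353, 18.103, 24.312, 30.029, 47.967, 96.953,
        0.000, 3.943, 5.782, 11.762, 17.741, 23.951, 28.353, 44.747, 89.692),
  event = c(0, 0, 0, 0, 1, 0, 0, 0, 0,
            0, 0, 0, 0, 0, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from any package): reduce one person's rows
# to a single row. Assumes event is coded 0/1.
first_event <- function(d) {
  had_event <- any(d$event == 1)
  data.frame(
    id = d$id[1],
    # Earliest event time if there was an event, else last follow-up time.
    t = if (had_event) min(d$t[d$event == 1]) else max(d$t),
    event = as.numeric(had_event),
    stringsAsFactors = FALSE
  )
}

# Split by id, reduce each piece to one row, and stack the rows.
first_tox <- do.call(rbind, lapply(split(tox, tox$id), first_event))
rownames(first_tox) <- NULL
first_tox
```

The idea is to let split() and lapply() handle the per-person grouping
so that no explicit loop over rows is needed. The resulting first_tox
should then be in a shape that could be passed to something like
survfit(Surv(t, event) ~ 1, data = first_tox) from the survival package
for the Kaplan-Meier curve.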



Dr. Scott Williams
Radiation Oncologist
Peter MacCallum Cancer Centre
Melbourne, Australia
ph +61 3 9656 1111
fax +61 3 9656 1424
scott.williams at petermac.org
