[R] AFTREG with ID argument

Thu Feb 11 17:30:07 CET 2010

On Feb 11, 2010, at 5:58 AM, Philipp Rappold wrote:

> Göran, thanks!
>
> One more thing that I found: As soon as you have at least one NA in  
> the independent vars, the trick that you mentioned does not work  
> anymore. Example:
>
> > testdata
>  start stop censor groupvar      var1
> 1     0    1      0        1 0.1284928
> 2     1    2      0        1 0.4896125
> 3     2    3      0        1 0.7012899
> 4     3    4      0        1        NA
> 5     0    1      0        2 0.7964361
> 6     1    2      0        2 0.8466039
> 7     2    3      1        2 0.2234271
>
> > aftreg(Surv(start, stop, censor)~var1, data=testdata, id=testdata$groupvar)
> Error in order(id, Y[, 1]) : Different length of arguments (* I  
> translated this from the German Output *)
>
> Do you think there is a simple hack which excludes all subjects that  
> have at least one NA in their independent vars? If it was only one  
> dependent var it would probably be easy by just using subset, but I  
> have lots of different combinations of vars that I'd like to test ;)
>

I don't know if it's a "hack", but there is a set of functions that  
perform such subsetting:

?na.omit

aftreg also has an na.action parameter that would accomplish that goal.  
You may want to check what your default for na.action is, e.g. with  
getOption("na.action").
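
For what it's worth, here is a minimal sketch of the subsetting asked about, done before the call to aftreg so that the data and the id vector keep the same length. It uses the toy data shown above; drop_incomplete_subjects is a hypothetical helper written for this illustration, not a function from eha or base R.

```r
# Toy data copied from the example quoted above.
testdata <- data.frame(
  start    = c(0, 1, 2, 3, 0, 1, 2),
  stop     = c(1, 2, 3, 4, 1, 2, 3),
  censor   = c(0, 0, 0, 0, 0, 0, 1),
  groupvar = c(1, 1, 1, 1, 2, 2, 2),
  var1     = c(0.1284928, 0.4896125, 0.7012899, NA,
               0.7964361, 0.8466039, 0.2234271)
)

# Row-level removal: na.omit() drops only the records that contain an NA.
row_level <- na.omit(testdata)

# Hypothetical helper (not part of eha): drop every subject that has an NA
# in any variable named in the model formula.
drop_incomplete_subjects <- function(formula, data, id_col) {
  # Variables from the formula that are columns of the data frame.
  vars <- intersect(all.vars(formula), names(data))
  # Records with a missing value in one of those columns.
  incomplete <- !complete.cases(data[, vars, drop = FALSE])
  # Subjects owning at least one such record.
  bad_ids <- unique(data[[id_col]][incomplete])
  data[!(data[[id_col]] %in% bad_ids), , drop = FALSE]
}

model <- Surv(start, stop, censor) ~ var1
clean <- drop_incomplete_subjects(model, testdata, "groupvar")
```

Because the helper reads the variable names from the formula, the same call can be reused for other combinations of covariates. With real data, the fit would then be along the lines of aftreg(model, data = clean, id = clean$groupvar), which is not run here.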

-- 
David.

> Best
> Philipp
>
> PS: Concerning the benchmark: For a large dataset (~ 1600 observations  
> on ~300 subjects) processing takes about 40 seconds (core 2 duo @  
> 2.46 GHz, T9300). Interestingly, processing the testdata-set above  
> with only 7 observations on 2 subjects takes 2 minutes...
>
> Göran Broström wrote:
>> Philipp Rappold wrote:
>>> Dear all,
>>>
>>> I have some trouble using the "id"-argument with aftreg  
>>> (accelerated failure time regression analysis from the eha library).
>>>
>>> As far as I understand it, the id argument is used to group  
>>> individuals together if there are time-varying covariates and the  
>>> data is arranged in counting process style.
>>>
>>> Unfortunately, I cannot figure out how to use the "id"-argument.  
>>> The most straightforward way would be to simply state the  
>>> grouping variable, but it throws an error. I've included an  
>>> example below: the dataframe for regression is called "test", with  
>>> the grouping variable "person".
>>>
>>> > test
>>>  start end censor person var1
>>> 1     0   1      0      1  0.5
>>> 2     1   2      0      1  0.4
>>> 3     2   3      0      1  0.6
>>> 4     3   4      1      1 -0.3
>>> 5     0   1      0      2  0.6
>>> 6     1   2      0      2  0.7
>>> 7     2   3      0      2  0.6
>>>
>>> > fit <- aftreg(Surv(start, end, censor)~var1, data=test, id=person)
>>> Error in order(id, Y[, 1]) : argument 1 is not a vector
>> You have caught the _function_ 'person' (package: utils) instead of  
>> the variable 'person' in the data frame. That explains the odd  
>> error message. If you change the variable name to, e.g., "ID",  
>> you'll get the error message
>> Error in order(id, Y[, 1]) : object 'id' not found
>> which would point you in the right direction. You need to specify  
>> 'id' by a full name, in your case 'test$person'. This is of course  
>> a deficiency in the interface of aftreg. I will fix it asap.
>> So the temporary fix is 'id = test$person'.
>> Thanks for the report,
>> Göran
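
A small sketch of the name lookup described above, using only base R so it runs without eha. It assumes a default session in which the utils package is attached; the data frame is the "test" example from the original post.

```r
# Example data from the original post.
test <- data.frame(
  start  = c(0, 1, 2, 3, 0, 1, 2),
  end    = c(1, 2, 3, 4, 1, 2, 3),
  censor = c(0, 0, 0, 1, 0, 0, 0),
  person = c(1, 1, 1, 1, 2, 2, 2),
  var1   = c(0.5, 0.4, 0.6, -0.3, 0.6, 0.7, 0.6)
)

# Outside the data frame, the bare name 'person' finds a function
# (utils::person), not the column.
found_function <- exists("person", mode = "function")

# order() cannot sort on a function, so this call raises an error.
order_failed <- inherits(
  tryCatch(order(person, test$start), error = function(e) e),
  "error"
)

# The workaround from the reply: pass the column itself as a plain vector.
ord <- order(test$person, test$start)

# test["person"] is a one-column data frame, not a vector, which is
# consistent with the `[.data.frame` error in the third call quoted below.
is_df <- is.data.frame(test["person"])
```

So of the three forms tried, only test$person hands over a plain vector of the right length.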
>>>
>>> > fit <- aftreg(Surv(start, end, censor)~var1, data=test, id=test["person"])
>>> Error in `[.data.frame`(id, ord) : undefined columns selected
>>>
>>>
>>>
>>> What would be the correct way to fit this example model?
>>>
>>> Thanks + all the best
>>> Philipp
>>
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT