[R] Interval censored Data in survreg() with zero values!
Geraldine Henningsen
ghenningsen at email.uni-kiel.de
Fri Dec 26 20:38:32 CET 2008
Hello again,
thank you very much for your help so far.
To be more specific, I have generated a simplified data set that is
similar to my real-world data:
set.seed( 123 )
data <- data.frame( x = runif( 200 ), y = NA )
for( i in 1:200 ){
data$y[ i ] <- rweibull( 1, 1, 70 + 10 * data$x[ i ] ) - 30
}
data$y[ data$y < 0 ] <- 0
data$y[ data$y > 100 ] <- 100
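As a quick check on the simulated data, the following sketch tabulates how many observations end up at each limit. It redraws the data in vectorized form instead of using the loop; the names y_latent and status are introduced here for illustration only and are not part of the code above.

```r
set.seed( 123 )
x <- runif( 200 )

# Uncensored ("latent") response: Weibull with shape 1 and
# scale 70 + 10 * x, shifted down by 30 as in the loop above
y_latent <- rweibull( 200, shape = 1, scale = 70 + 10 * x ) - 30

# Clamp to [0, 100], mirroring the two assignment lines above
y <- pmin( pmax( y_latent, 0 ), 100 )

# Classify each observation by which limit (if any) it hit
status <- cut( y_latent, breaks = c( -Inf, 0, 100, Inf ),
   labels = c( "left", "uncensored", "right" ) )
print( table( status ) )
```

The shares in this table indicate how much of the sample the censored model has to account for at each bound.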
Fitting a two-limit tobit model based on the normal distribution (with
tobit() from the AER package) works:
estNorm <- tobit( y ~ x, left = 0, right = 100, data = data )
Since my data are obviously not normally distributed, I tried the
Weibull distribution, but this does not work (as I wrote before).
estWeibull <- tobit( y ~ x, left = 0, right = 100, dist = "weibull",
   data = data )
I have tried to implement Terry's suggestion.
> [...] Using Surv(t1, t2, type='interval2'), you can have
> a left censored observation where time of event < t: represented as (NA, t)
> a right censored observation where time of event >t: represented as (t, NA)
> an interval censored observations t1<=time <= t2 : represented as (t1,t2)
>
estWeibull2 <- survreg( Surv( ifelse( y == 0, NA, y ),
   ifelse( y == 100, y, NA ), type = "interval2" ) ~ x, data = data )
Is this correct?
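For comparison, here is a minimal sketch of the encoding as the quoted text describes it, written in base R so that the bounds can be inspected before they go into Surv(). The helper encode_interval2 and the column names lower and upper are hypothetical and used only for illustration. The sketch assumes that an uncensored observation is written with both bounds equal to the observed value, which is how type = "interval2" denotes an exact observation. Under that reading, the call above would turn every uncensored value into (y, NA), i.e. right censored, and every y == 100 into (100, 100), i.e. exact.

```r
# Hypothetical helper (not from the post): map a response censored at
# `left` and `right` onto (lower, upper) pairs for type = "interval2"
encode_interval2 <- function( y, left = 0, right = 100 ) {
   data.frame(
      # NA lower bound marks a left-censored observation: (NA, left)
      lower = ifelse( y <= left, NA_real_, y ),
      # NA upper bound marks a right-censored observation: (right, NA)
      upper = ifelse( y >= right, NA_real_, y )
   )
}

# Small illustrative input: one value at each limit, two in between
y <- c( 0, 12.5, 100, 47 )
bounds <- encode_interval2( y )
print( bounds )
```

The two columns could then be passed as Surv( lower, upper, type = "interval2" ) on the left-hand side of the survreg() formula. One caveat, stated as an assumption rather than a tested result: survreg() fits the Weibull model on the log scale, so a bound of exactly 0 may be rejected as an invalid survival time even when the encoding itself is right.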
My endogenous (dependent) variable is not a time-to-event variable but a
percentage, which is naturally censored to the interval [0, 100].
Unfortunately, many data points are exactly 0 or exactly 100, and the rest
of the data are asymmetrically distributed. So I would like to apply a
two-limit tobit model, regressing the percentage on several explanatory
variables.
Best,
Geraldine