[R] sciplot question

Tue May 26 10:07:38 CEST 2009

On May 26, 2009, at 4:37 , Frank E Harrell Jr wrote:

> Manuel Morales wrote:
>> On Mon, 2009-05-25 at 06:22 -0500, Frank E Harrell Jr wrote:
>>> Jarle Bjørgeengen wrote:
>>>> On May 24, 2009, at 4:42 , Frank E Harrell Jr wrote:
>>>>
>>>>> Jarle Bjørgeengen wrote:
>>>>>> On May 24, 2009, at 3:34 , Frank E Harrell Jr wrote:
>>>>>>> Jarle Bjørgeengen wrote:
>>>>>>>> Great,
>>>>>>>> thanks Manuel.
>>>>>>>> Just out of curiosity, is there any particular reason you chose the
>>>>>>>> standard error, and not a confidence interval, as the default
>>>>>>>> error indication? The naming of the plotting functions suggests
>>>>>>>> confidence intervals.
>>>>>>>> - Jarle Bjørgeengen
>>>>>>>> On May 24, 2009, at 3:02 , Manuel Morales wrote:
>>>>>>>>> You define your own function for the confidence intervals.  
>>>>>>>>> The function
>>>>>>>>> needs to return the two values representing the upper and  
>>>>>>>>> lower CI
>>>>>>>>> values. So:
>>>>>>>>>
>>>>>>>>> qt.fun <- function(x) qt(p=.975,df=length(x)-1)*sd(x)/sqrt(length(x))
>>>>>>>>> my.ci <- function(x) c(mean(x)-qt.fun(x), mean(x)+qt.fun(x))
>>>>>>> Minor improvement: mean(x) + qt.fun(x)*c(-1,1) but in general  
>>>>>>> confidence limits should be asymmetric (a la bootstrap).
>>>>>> Thanks,
>>>>>> if the data are normally distributed, a symmetric confidence
>>>>>> interval should be OK, right?
>>>>> Yes; I do see a normal distribution about once every 10 years.
>>>> Is it not true that Student's t (qt(...) and so on) confidence
>>>> intervals are quite robust against non-normality too?
>>>>
>>>> A teacher told me that symmetric Student's t confidence
>>>> intervals will give an adequate picture of the variability of the
>>>> data in this particular case.
>>> Incorrect.  Try running some simulations on highly skewed data.   
>>> You will find situations where the confidence coverage is not very
>>> close to the stated level (e.g., 0.95) and more situations where
>>> the overall coverage is 0.95 because one tail area is near 0 and  
>>> the other is near 0.05.
>>>
>>> The larger the sample size, the more skewness has to be present to  
>>> cause this problem.
>> OK - I'm convinced. It turns out that the first change I made to  
>> sciplot
>> was to allow for asymmetric error bars. Is there an easy way (i.e., an
>> existing package) to bootstrap confidence intervals in R? If so, I'll
>> try to incorporate this as an option in sciplot.
>
> library(Hmisc)
> ?smean.cl.boot

H(arrell)misc :-)

Thanks for the valuable input, Frank.

This seems to work fine (slightly more time-consuming, but what do
we have CPU power for?):

library(Hmisc)
library(sciplot)
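# Call smean.cl.boot once, so both limits come from the same bootstrap run;
# elements 2 and 3 of its result are the Lower and Upper limits.
my.ci <- function(x) smean.cl.boot(x)[2:3]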

lineplot.CI(V1, V2, data=d, col=c(4), err.col=c(1), err.width=0.02,
            legend=FALSE, xlab="Timeofday", ylab="IOPS", ci.fun=my.ci,
            cex=0.5, lwd=0.7)

Have I understood you correctly that this is a more accurate way of
visualizing variability in any dataset than Student's t
confidence intervals, because it does not assume normality?
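
Following Frank's suggestion to run simulations on highly skewed data,
here is a rough sketch of how I would check the t interval myself. The
lognormal distribution, n = 20 and 5000 simulated samples are my own
assumptions, not something from this thread, and t.ci is just a local
helper name. It counts how often the interval misses the true mean on
each side, so the two tail areas can be compared with each other and
with the nominal 0.025.

```r
# Sketch: coverage of the symmetric Student's t interval on skewed data.
# Assumptions (mine): lognormal data, n = 20, 5000 simulated samples.
set.seed(1)
n         <- 20
nsim      <- 5000
true.mean <- exp(0.5)   # mean of a lognormal with meanlog = 0, sdlog = 1

# Symmetric t interval, written the way Frank suggested
t.ci <- function(x) {
  half <- qt(p = 0.975, df = length(x) - 1) * sd(x) / sqrt(length(x))
  mean(x) + half * c(-1, 1)
}

# nsim x 2 matrix: column 1 = lower limit, column 2 = upper limit
ci <- t(replicate(nsim, t.ci(rlnorm(n))))

miss.low  <- mean(ci[, 2] < true.mean)  # whole interval below the true mean
miss.high <- mean(ci[, 1] > true.mean)  # whole interval above the true mean
coverage  <- 1 - miss.low - miss.high

round(c(coverage = coverage, miss.low = miss.low, miss.high = miss.high), 3)
```

Changing n or replacing rlnorm with rnorm (and true.mean with 0) should
show how the picture depends on sample size and skewness.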

Can you explain the meaning of B, and how to find a sensible value (if
the default is not sufficient)?
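
My current guess, from reading ?smean.cl.boot, is that B is the number
of bootstrap resamples and that the limits are quantiles of the B
resampled means. Below is a base R sketch of that idea so I can
experiment with B. Note that boot.ci.mean is a hypothetical helper of my
own, not a function from Hmisc or sciplot, and the data are made up. It
repeats the bootstrap several times for each B and reports how much the
upper limit varies between runs.

```r
# Sketch of a percentile bootstrap for the mean (base R only).
# boot.ci.mean is a hypothetical helper, not part of Hmisc or sciplot.
boot.ci.mean <- function(x, B = 1000, conf.int = 0.95) {
  n <- length(x)
  # B resampled means, each computed from n draws with replacement
  means <- replicate(B, mean(sample(x, n, replace = TRUE)))
  # Lower and upper limits are quantiles of the resampled means
  unname(quantile(means, c((1 - conf.int) / 2, (1 + conf.int) / 2)))
}

set.seed(1)
x <- rlnorm(30)   # made-up skewed sample, for illustration only

# For each B, run the bootstrap 20 times and measure the run-to-run
# standard deviation of the upper limit.
spread <- sapply(c(100, 1000, 5000), function(B)
  sd(replicate(20, boot.ci.mean(x, B = B)[2])))
names(spread) <- c("B=100", "B=1000", "B=5000")
spread
```

If the spread is small compared with the width of the interval, I would
take that B to be large enough for plotting purposes, but please correct
me if this is the wrong way to think about it.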

Best regards
Jarle Bjørgeengen