[R] Fwd: construct boxplots from data with varying column widths
David Winsemius
dwinsemius at comcast.net
Sat Jul 16 19:27:12 CEST 2011
From: David Winsemius <dwinsemius at comcast.net>
On Jul 16, 2011, at 12:15 PM, Rory Campbell-Lange wrote:
> On 16/07/11, David Winsemius (dwinsemius at comcast.net) wrote:
>>
>> On Jul 16, 2011, at 11:19 AM, Rory Campbell-Lange wrote:
>>
>>> I'm an R beginner, and I would like to construct a set of boxplots
>>> showing database function runtimes.
>
>>> I can easily reformat the base data to provide it to R in a format
>>> such as:
>>>
>>> function1,12.5
>>> function1,13.11
>>> function1,35.2
>>> ...
>
>> That is definitely to be preferred. Read that into R and show us the
>> results of str on your R data object.
>
> Thanks for your suggestion.
>
>> str(data2)
> 'data.frame': 1940170 obs. of 2 variables:
> $ function.: Factor w/ 127 levels "fn_activities01_list",..: 102
> 102 102 102 102 102 102 102 102 102 ...
> $ runtime : num 38.1 32.4 41.2 92.9 130.5 ..
>
>> head(data2)
> function. runtime
> 1 fn_slot03_byperson 38.083
> 2 fn_slot03_byperson 32.396
> 3 fn_slot03_byperson 41.246
> 4 fn_slot03_byperson 92.904
> 5 fn_slot03_byperson 130.512
> 6 fn_slot03_byperson 113.853
>
> tmp <- data2[data2$dbfunc=='fn_slot03_byperson',]
>> length(tmp$runtime)
> [1] 24004
>> ave(tmp$runtime)[1]
> [1] 41.8108
I would have guessed you would get an error, but when ave() is given
no grouping factor it simply returns the grand mean of its first
argument, repeated for every element. That is why [1] picks it out.
To get the mean for every function at once, try one of these instead:

  aggregate(data2$runtime, by=list(data2$function.), FUN=mean)
  tapply(data2$runtime, data2$function., FUN=mean)
  data2$grpmean <- ave(data2$runtime, data2$function., FUN=mean)

The last one adds a column to the data frame and could be useful for
identifying items that are some particular distance away from their
group mean.
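
A minimal sketch of those calls on a small made-up data frame, in case
it helps to see what each one returns. The data and the names toy,
grand, agg, tap and dev are invented for illustration and are not from
the original post; only the column names follow the str(data2) output
quoted above.

```r
# Toy data, invented for illustration; column names mirror str(data2)
toy <- data.frame(
  function. = factor(c("fn_a", "fn_a", "fn_a", "fn_b", "fn_b")),
  runtime   = c(10, 20, 30, 100, 200)
)

# With no grouping factor, ave() returns the grand mean on every row
grand <- ave(toy$runtime)

# One row per group; note that 'by' has to be a list
agg <- aggregate(toy$runtime, by = list(fn = toy$function.), FUN = mean)

# Same group means, returned as a named array
tap <- tapply(toy$runtime, toy$function., FUN = mean)

# Group mean repeated on each row of the original data frame
toy$grpmean <- ave(toy$runtime, toy$function., FUN = mean)

# Signed distance of each observation from its own group mean
toy$dev <- toy$runtime - toy$grpmean

print(agg)
print(tap)
print(toy)
```

With the dev column in place, something like toy[abs(toy$dev) > 40, ]
would pull out the rows that sit far from their group mean; the cutoff
of 40 is arbitrary and only there to show the idea.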
--
David Winsemius, MD
West Hartford, CT