[R] FW: Bubble plots
Cody Hamilton
Cody_Hamilton at Edwards.com
Sun Aug 3 06:51:20 CEST 2008
Dear Hadley,
I tried to fit the line plot you suggested:
time<-c(rep('time 1',10),rep('time 2',10),rep('time 3',10))
y<-c('a','b','c','d','a','b','c','a','d','a','a','a','b','c','d',
'b','b','c','d','d','b','c','c','d','d','a','a','b','c','d')
D<-data.frame(cbind(y,time))
tab <- prop.table(table(D), margin = 2)
df <- as.data.frame(tab, responseName = "freq")
library(ggplot2)
qplot(time, freq, data=df, colour = y, geom="line", position="jitter")
However, I got the error message 'Error in do.call("gList", panels) : second argument must be a list.' Have I made a mistake?
By the way, is there a way to make the symbols on the plot the actual values of y? For instance, the frequency of level 'a' at time point 'time 1' could be represented by a red 'a' instead of by a red dot.
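One possible approach to the lettered symbols, offered as a sketch rather than a confirmed answer from this thread, is to draw text labels instead of points. The example below rebuilds the frequency table from the 30 observations above, draws the letters with base graphics, and then tries the same idea with ggplot2's geom_text() if that package is installed. The colour assignment and the use of the modern ggplot() interface in place of qplot() are assumptions.

```r
# Rebuild the frequency table from the 30 observations in this message.
time <- c(rep("time 1", 10), rep("time 2", 10), rep("time 3", 10))
y <- c("a","b","c","d","a","b","c","a","d","a","a","a","b","c","d",
       "b","b","c","d","d","b","c","c","d","d","a","a","b","c","d")
D <- data.frame(y = y, time = time)
df <- as.data.frame(prop.table(table(D), margin = 2), responseName = "freq")

# Base graphics: set up empty axes, then draw each level's letter.
x <- as.integer(df$time)                 # factor codes 1, 2, 3
plot(x, df$freq, type = "n", xaxt = "n", xlab = "time", ylab = "freq")
axis(1, at = seq_along(levels(df$time)), labels = levels(df$time))
# Palette entries 2..5 are an arbitrary choice; entry 2 is a red shade.
text(x, df$freq, labels = as.character(df$y), col = as.integer(df$y) + 1)

# ggplot2 version; skipped when the package is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(time, freq, colour = y, label = y)) + geom_text()
  print(p)
}
```

Letters that share the same frequency at a time point will overprint each other, so some jittering or nudging may still be needed.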
Regards,
-Cody
________________________________________
From: hadley wickham [h.wickham at gmail.com]
Sent: Saturday, August 02, 2008 6:24 AM
To: Frank E Harrell Jr
Cc: Cody Hamilton; r-help at r-project.org
Subject: Re: [R] FW: Bubble plots
On Sat, Aug 2, 2008 at 8:10 AM, Frank E Harrell Jr
<f.harrell at vanderbilt.edu> wrote:
> Cody Hamilton wrote:
>>
>> Is there a way to create a 'bubble plot' in R?
>>
>> For example, if we define the following data frame containing the level of
>> y observed for 5 patients at three time points:
>>
>> time<-c(rep('time 1',5),rep('time 2',5),rep('time 3',5))
>> y<-c('a','b','c','d','a','b','c','a','d','a','a','a','b','c','d')
>> D<-data.frame(cbind(y,time))
>>
>> I would like to display the percentage of subjects in each level of y at
>> each time point as a bubble whose size is proportional to the percentage of
>> subjects in the given level of y at the given time point. Thus, in the case
>> of the data frame above the plot would have the levels of y
>> ('a','b','c','d') on the y-axis and the levels of time ('time 1','time 2',
>> 'time 3') on the x-axis with four bubbles above each time point (e.g. the
>> size of the bubble in the bottom left corner of the plot would be
>> proportional to the percentage of patients with y='a' at time='time 1').
>>
>> I am running R 2.7.1 under Windows.
>>
>> Regards,
>> -Cody
>>
>
> The xYplot function in the Hmisc package can do that. It may be more
> elegant using ggplot2.
It's certainly possible to do it with ggplot2:
tab <- prop.table(table(D), margin = 2)
df <- as.data.frame(tab, responseName = "freq")
library(ggplot2)
qplot(y, time, data = df, size = freq)
qplot(y, time, data = df, size = freq) + scale_area()
qplot(y, time, data = df, size = freq) + scale_area(to=c(1,5))
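For readers on a current ggplot2 release, where scale_area() is no longer available, the following is a rough equivalent and not the code originally posted. It assumes scale_size_area() as the replacement, picks max_size = 10 arbitrarily, and puts time on the x-axis and y on the y-axis as in the original request. The data are the 15 observations from the quoted question.

```r
# The 15 observations from the original question.
time <- c(rep("time 1", 5), rep("time 2", 5), rep("time 3", 5))
y <- c("a","b","c","d","a","b","c","a","d","a","a","a","b","c","d")
D <- data.frame(y = y, time = time)

# Proportion of subjects in each level of y within each time point.
df <- as.data.frame(prop.table(table(D), margin = 2), responseName = "freq")

# Skipped when ggplot2 is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # scale_size_area() makes point area, not radius, proportional to freq.
  p <- ggplot(df, aes(time, y, size = freq)) +
    geom_point() +
    scale_size_area(max_size = 10)   # max_size is an arbitrary choice
  print(p)
}
```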
But I wouldn't recommend it - you're trying to visualise an important
number (frequency) using a perceptual mapping (size) that humans
aren't very good at. Why not do a scatterplot of frequency vs time?
qplot(time, freq, data=df, colour = y)
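To see what is being plotted, it can help to print the proportions directly. This sketch uses only base R and assumes the 15 observations from the quoted question; the rounding to two decimals is for display only.

```r
# The 15 observations from the original question.
time <- c(rep("time 1", 5), rep("time 2", 5), rep("time 3", 5))
y <- c("a","b","c","d","a","b","c","a","d","a","a","a","b","c","d")
D <- data.frame(y = y, time = time)

# margin = 2 normalises within each column (time point),
# so every column of tab sums to 1.
tab <- prop.table(table(D), margin = 2)
print(round(tab, 2))

# Long format: one row per (y, time) cell with its proportion.
df <- as.data.frame(tab, responseName = "freq")

# How many cells share each distinct frequency value.
print(table(df$freq))
```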
There are only a few different values of freq for this example, so a
little jittering helps:
qplot(time, freq, data=df, colour = y, geom="jitter")
Since you have time on the x-axis it's common to use a line plot:
df$time <- as.numeric(gsub("time ", "", df$time))
qplot(time, freq, data=df, colour = y, geom="line")
although again you have an overplotting problem, which you could solve
with jittering:
qplot(time, freq, data=df, colour = y, geom="line", position="jitter")
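The line plot relies on time being numeric. as.data.frame() on a table returns factors, and with a factor on the x-axis ggplot2 may treat every time point as its own group, leaving nothing to connect. The sketch below checks the conversion step and then draws the jittered lines with the modern ggplot() interface; the explicit group = y and the jitter widths are assumptions, not part of the code above.

```r
# The 15 observations from the original question.
time <- c(rep("time 1", 5), rep("time 2", 5), rep("time 3", 5))
y <- c("a","b","c","d","a","b","c","a","d","a","a","a","b","c","d")
D <- data.frame(y = y, time = time)
df <- as.data.frame(prop.table(table(D), margin = 2), responseName = "freq")

# time is a factor at this point.
stopifnot(is.factor(df$time))

# Strip the "time " prefix and convert to numbers (1, 2, 3).
df$time <- as.numeric(gsub("time ", "", df$time))
stopifnot(is.numeric(df$time))

# Skipped when ggplot2 is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # group = y asks for one line per level of y; jitter sizes are arbitrary.
  p <- ggplot(df, aes(time, freq, colour = y, group = y)) +
    geom_line(position = position_jitter(width = 0.05, height = 0.01))
  print(p)
}
```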
Hadley
--
http://had.co.nz/