[R] combined plot of observed and expected fractions
Gerrit Draisma
gdraisma at xs4all.nl
Tue Nov 30 17:42:34 CET 2010
Dear R-users,
I have a dataset with the number of cases by age
and stage.
I want to plot the observed stage distribution by age
and compare it to an expected or predicted one.
As I want to plot these for several populations
I want to use lattice plots.
Below is example code I made that produces a plot
that satisfies me. I prefer this to using stacked bars.
But I would like to improve on the code.
Specifically:
Question 1: How to get cumulative fractions from
observed numbers?
Question 2: How to divide each number by the
total number in each age group?
Question 3: How to construct an indicator
for combined categories (here:
observed/expected x Stage)?
Question 4: Lattice: could I use data values
to locate the text?
I would appreciate any comments.
Thank you,
Gerrit.
=====
# stagexage.r
# compare observed and predicted stage by age.
library(lattice)
# simple data set
df<-data.frame(Age=rep(1:5,each=3),Stage=1:3)
# expectation
df$mu<-5+(df$Age-1)*df$Stage
# simulated data
df$N<-rpois(15,df$mu)
# for lattice xyplot
df<-reshape(df,direction="long",varying=c("N","mu"),timevar="OE",v.names="N")
# cumulate numbers over Age group and OE (Observed, Expected)
# Q1: how to apply cumsum over Stage within each Age and OE group?
i<-df$Age+5*(df$OE-1)
P<-unlist(tapply(df$N,i,cumsum))
# compute probabilities
# Q2: How to divide by the total number in each Age and OE group?
j<-3*(0:29%/%3) + 3
P<-P/P[j]
# combining OE and Stage in a single index
# for superposing graphs in one plot
# Q3: How to create an index from Stage and number (obs or exp)?
ix<-(df$Stage-1)*2+df$OE
# plot observed and predicted fractions in a "area" plot
j<-df$Stage!=3
xyplot(P[j]~Age,groups=ix[j],data=df[j,],type="o",
  panel=function(x,y,...){
    panel.xyplot(x,y,...)
    # Q4: would it be possible to position labels based
    # on data values?
    panel.text(x=3,y=c(0.15,0.5,0.85),labels=paste("Stage", 1:3))
  },
  scales=list(y=list(at=0:4/4)),
  axs="i",ylim=c(0,1),
  xlab="Age", ylab="Fraction",
  par.settings=list(superpose.line=list(col=c("blue","red"),lty=1:2),
    superpose.symbol=list(col=c("blue","red"),pch=1)),
  auto.key=list(text=c("Obs","Pred"),
    points=FALSE,lines=TRUE,type="o",divide=1,columns=2)
)
=====
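One possible way to approach Questions 1 to 4 with base R only, given as a sketch rather than a definitive answer. It rebuilds the same toy data in long format and then uses ave() for the grouped cumulative sums and totals, and interaction() for the combined index. The names d, long, cumN, label_y and the fixed seed are assumptions made for this illustration; they do not appear in the script above.

```r
# Sketch under stated assumptions: same toy data as above, rebuilt in long
# format with a character column OE ("Obs"/"Pred") instead of 1/2.
set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
d <- data.frame(Age = rep(1:5, each = 3), Stage = rep(1:3, times = 5))
d$mu <- 5 + (d$Age - 1) * d$Stage      # expectation
d$N  <- rpois(nrow(d), d$mu)           # simulated counts

long <- rbind(data.frame(d[c("Age", "Stage")], OE = "Obs",  N = d$N),
              data.frame(d[c("Age", "Stage")], OE = "Pred", N = d$mu))
# Sort so that cumsum runs over Stage in increasing order within each group
long <- long[order(long$OE, long$Age, long$Stage), ]

# Q1: cumulative counts over Stage within each Age x OE group
long$cumN <- ave(long$N, long$Age, long$OE, FUN = cumsum)

# Q2: divide by the group total to get cumulative fractions
long$P <- long$cumN / ave(long$N, long$Age, long$OE, FUN = sum)

# Q3: one factor combining OE and Stage; OE varies fastest, so the levels
# alternate Obs/Pred in the same way as ix in the script above
long$ix <- interaction(long$OE, long$Stage)

# Q4: label heights taken from the data: midpoints of the predicted bands
# at Age 3 (band boundaries are 0 and the cumulative fractions)
b <- c(0, long$P[long$OE == "Pred" & long$Age == 3])
label_y <- (head(b, -1) + tail(b, -1)) / 2
```

With these columns the plot call could use P ~ Age with groups = ix on the subset Stage != 3, and label_y could replace the hard-coded c(0.15,0.5,0.85) in panel.text(). Whether the labels should follow the observed or the predicted curves is a matter of taste; the predicted ones are used here because they do not change between simulations.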
--
Gerrit Draisma
Department of Public Health
Erasmus MC, University Medical Center Rotterdam
Room AE-235
P.O. Box 2040 3000 CA Rotterdam The Netherlands
Phone: +31 10 7043787 Fax: +31 10 7038474
http://mgzlx4.erasmusmc.nl/pwp/?gdraisma