[R] Data manipulation problem
Bert Gunter
gunter.berton at gene.com
Mon Apr 5 20:59:36 CEST 2010
You have tempted, and being weak, I yield to temptation:
"Any good ideas?"
Yes. Don't do this.
(What you probably really want to do is fit a model with age as a covariate,
which can be done statistically, e.g. by logistic regression, or graphically
using conditioning plots, e.g. via trellis graphics (the lattice package).
This avoids the arbitrariness and discontinuities of binning by age range.)
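
A minimal sketch of both routes, under stated assumptions. The original post
has only cases (ID, age, year), so the binary outcome `case`, the simulated
population, and the names `dat`, `fit` and `p` are all hypothetical and serve
only to make the example run. Treating age and year as continuous covariates
is one reading of the suggestion above, not necessarily the intended model.

```r
# Hypothetical data: simulated individuals with age, calendar year and a
# binary outcome. None of this comes from the original post.
set.seed(1)
n   <- 2000
dat <- data.frame(
  age  = round(runif(n, 0, 90), 1),            # age to one decimal
  year = sample(1960:2009, n, replace = TRUE)  # yearly data
)
# Assumed risk that rises smoothly with age (illustration only)
dat$case <- rbinom(n, 1, plogis(-4 + 0.04 * dat$age))

# Statistical route: logistic regression with age kept continuous,
# so no age bins have to be chosen
fit <- glm(case ~ age + year, family = binomial, data = dat)
summary(fit)$coefficients

# Graphical route: conditioning plot of outcome against age,
# one panel per (overlapping) range of years, with a loess smooth
library(lattice)
p <- xyplot(case ~ age | equal.count(year, number = 4), data = dat,
            type = c("p", "smooth"), xlab = "Age", ylab = "Case (0/1)")
```

In a script, `print(p)` draws the plot; at the console, typing `p` is enough.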
Bert Gunter
Genentech Nonclinical Biostatistics
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of moleps
Sent: Monday, April 05, 2010 11:46 AM
To: r-help at r-project.org
Subject: [R] Data manipulation problem
Dear R'ers.
I've got a dataset with age and year of diagnosis. In order to
age-standardize the incidence I need to transform the data into a matrix
with age-groups (divided in 5 or 10 years) along one axis and year divided
into 5 years along the other axis. Each cell should contain the number of
cases for that age group and for that period.
I.e.
My data format now is
ID-age (to one decimal)-year (yearly data).
What I'd like is
age     1960-1965  1966-1970  etc...
0-5     3          8          10  15
6-10    2          5          8   13
etc..
Any good ideas?
Regards,
M
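
For reference, the table asked for above can be sketched with `cut()` and
`table()`. The data frame `dat` and its column names are hypothetical, and
the half-open bins ([0,5), [5,10), ... and [1960,1965), ...) are an
assumption: the ranges in the question (0-5, 6-10; 1960-1965, 1966-1970) are
not of equal width and leave ages such as 5.5 unassigned. The caution in the
reply about the arbitrariness of binning still applies.

```r
# Hypothetical data in the format described: ID, age (one decimal), year
set.seed(42)
dat <- data.frame(
  id   = 1:500,
  age  = round(runif(500, 0, 89.9), 1),
  year = sample(1960:1989, 500, replace = TRUE)
)

# Left-closed, right-open bins, so age 5.0 falls in [5,10) and
# year 1965 falls in [1965,1970)
age_grp  <- cut(dat$age,  breaks = seq(0, 90, by = 5), right = FALSE)
year_grp <- cut(dat$year, breaks = seq(1960, 1990, by = 5),
                right = FALSE, dig.lab = 4)

# Cross-tabulate: rows are age groups, columns are 5-year periods,
# each cell is the number of cases
counts <- table(age = age_grp, period = year_grp)
counts
```

Changing `by = 5` to `by = 10` in the age breaks gives 10-year age groups.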
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.