[R] reshape2's dcast() Adds NAs to Data Frame
Rich Shepard
rshepard at appl-ecosys.com
Wed Aug 8 20:33:39 CEST 2012
On Tue, 7 Aug 2012, R. Michael Weylandt wrote:
> Can you provide a reproducible example? See, e.g.,
Michael,
I think the attached 'sample.txt' and 'sample.cast.txt' should do. There
are no missing values in sample.txt, but there are in the reshaped data
frame. The sequence of commands I used to generate these is:
> sample <- read.table('sample.txt', header = TRUE, sep = ',')
> sample$sampdate <- as.Date(as.character(sample$sampdate))
> sample$ceneq1 <- as.logical(sample$ceneq1)
> str(sample)
'data.frame': 715 obs. of 8 variables:
$ site : Factor w/ 5 levels "D-1","D-2","D-3",..: 1 1 1 1 1 1 1 ...
$ sampdate: Date, format: "2007-12-12" "2007-12-12" ...
$ era : Factor w/ 2 levels "Post","Pre": 1 1 1 1 1 1 1 1 1 1 ...
$ param : Factor w/ 54 levels "AgDis","AgTot",..: 2 4 5 7 10 13 21 ...
$ quant : num 1.30e-04 1.06e-01 2.31e+02 1.13e-02 5.00e-03 2.39e-02 ...
$ ceneq1 : logi TRUE FALSE FALSE FALSE TRUE FALSE ...
$ floor : num 0 0.106 231 0.0113 0 100 0 1.43 0 0.0239 ...
$ ceiling : num 1.30e-04 1.06e-01 2.31e+02 1.13e-02 5.00e-03 2.39e-02 ...
> sample.melt <- melt(sample, id.vars = c('site', 'sampdate', 'era', 'param', 'ceneq1', 'floor', 'ceiling'))
> sample.cast <- dcast(sample.melt, site + sampdate + era + ceneq1 + floor + ceiling ~ param)
> str(sample.cast)
'data.frame': 668 obs. of 60 variables:
$ site : Factor w/ 5 levels "D-1","D-2","D-3",..: 1 1 1 1 1 1 1 ...
$ sampdate: Date, format: "2007-12-12" "2007-12-12" ...
$ era : Factor w/ 2 levels "Post","Pre": 1 1 1 1 1 1 1 1 1 1 ...
$ ceneq1 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ floor : num 0.00132 0.0113 0.0239 0.0253 0.0348 0.106 0.293 4.11 ...
$ ceiling : num 0.00132 0.0113 0.0239 0.0253 0.0348 0.106 0.293 4.11 ...
$ AgDis : num NA NA NA NA NA NA NA NA NA NA ...
$ AgTot : num NA NA NA NA NA NA NA NA NA NA ...
$ AlDis : num NA NA NA NA NA NA NA NA NA NA ...
$ AlTot : num NA NA NA NA NA 0.106 NA NA NA NA ...
etc.
> dput(sample, 'sample.txt')
> dput(sample.cast, 'sample.cast.txt')
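A minimal sketch of how dcast() can introduce NAs even when the input has none. It uses a made-up four-row data frame: 'toy' and its values are hypothetical and not taken from sample.txt. When a column that varies with param (here, floor) is on the left-hand side of the formula, each output row is defined by one site + floor combination, and every param column with no matching input row is filled with NA. This may be what is happening with ceneq1, floor, and ceiling above, but that is an assumption, not something checked against the attached data.

```r
library(reshape2)

# Hypothetical toy data: 2 sites x 2 params, no missing values.
toy <- data.frame(
  site  = c("D-1", "D-1", "D-2", "D-2"),
  param = c("AgTot", "AlTot", "AgTot", "AlTot"),
  quant = c(0.00013, 0.106, 0.0002, 0.25),
  floor = c(0, 0.106, 0, 0.25),
  stringsAsFactors = FALSE
)

# quant is the only measured variable; everything else is an id.
toy.melt <- melt(toy, id.vars = c("site", "param", "floor"))

# floor differs between params, so each site + floor row holds a value
# for only one param; the other param column is filled with NA.
wide.with.floor <- dcast(toy.melt, site + floor ~ param, value.var = "value")

# With only site on the left-hand side, each row gets both params.
wide.site.only <- dcast(toy.melt, site ~ param, value.var = "value")

print(wide.with.floor)
print(wide.site.only)
```

Comparing the two results shows whether the NAs come from the choice of id columns in the formula rather than from the data.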
The context for this is that I am learning how to use the NADA package to
plot and analyze left-censored data. The full data set has 64 site and param
levels. I don't know whether I can use the base data frame, the reshaped
(dcast) data frame, or individual subsets (one for each parameter).
Rich
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sample.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120808/5cb020e3/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sample.cast.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120808/5cb020e3/attachment-0001.txt>