[R] Help with recast() syntax
David Winsemius
dwinsemius at comcast.net
Tue Nov 29 07:25:30 CET 2011
On Nov 29, 2011, at 12:32 AM, Chris Conner wrote:
> Dear Help-Rs,
>
> I have data similar to the following:
>
> DF <- structure(list(X = 1:22, RESULT = structure(c(2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
This section of the structure yields two NEG entries for 201109 and no POS entry: the 11th RESULT code is 1L (NEG) where it should be 2L (POS).
> 1L, 1L), .Label = c("NEG", "POS"), class = "factor"), YR_MO =
> c(201011L,
> 201012L, 201101L, 201102L, 201103L, 201104L, 201105L, 201106L,
> 201107L, 201108L, 201109L, 201011L, 201012L, 201101L, 201102L,
> 201103L, 201104L, 201105L, 201106L, 201107L, 201108L, 201109L
> ), TOT_TESTS = c(66L, 98L, 109L, 122L, 113L, 111L, 113L, 146L,
> 124L, 130L, 120L, 349L, 393L, 376L, 371L, 396L, 367L, 406L, 383L,
> 394L, 412L, 379L)), .Names = c("X", "RESULT", "YR_MO", "TOT_TESTS"
> ), class = "data.frame", row.names = c(NA, -22L))
>
> Currently there are 2 observations for each month (one for negative
> and one for positive test results). What I need is to create a data
> set that looks like the following, with positive and negative test
> results in the same row organized by month:
After fixing the POS/NEG discrepancy, this works with dcast() from the reshape2 package:
> dcast(DF, YR_MO ~ RESULT, value_var="TOT_TESTS")
    YR_MO NEG POS
1  201011 349  66
2  201012 393  98
3  201101 376 109
4  201102 371 122
5  201103 396 113
6  201104 367 111
7  201105 406 113
8  201106 383 146
9  201107 394 124
10 201108 412 130
11 201109 379 120
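For anyone rerunning this later: as far as I know, later reshape2 releases renamed the argument from value_var to value.var. The following is a minimal sketch under that assumption, with the data rebuilt by hand so that the 11th RESULT code is POS; the name wide is only illustrative.

```r
library(reshape2)

# The poster's data, with the 11th RESULT code corrected from NEG to POS.
DF <- data.frame(
  X = 1:22,
  RESULT = factor(rep(c("POS", "NEG"), each = 11), levels = c("NEG", "POS")),
  YR_MO = rep(c(201011L, 201012L, 201101L, 201102L, 201103L, 201104L,
                201105L, 201106L, 201107L, 201108L, 201109L), times = 2),
  TOT_TESTS = c(66L, 98L, 109L, 122L, 113L, 111L, 113L, 146L, 124L, 130L,
                120L, 349L, 393L, 376L, 371L, 396L, 367L, 406L, 383L, 394L,
                412L, 379L)
)

# One row per month, one column per test result; X is simply ignored.
# Assumption: a reshape2 version that spells the argument value.var.
wide <- dcast(DF, YR_MO ~ RESULT, value.var = "TOT_TESTS")
print(wide)
```

If your installed version still expects value_var, use the call exactly as written above the table.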
--
David.
>
> DF2<-structure(list(X = 1:11, RESULT = structure(c(1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "POS", class = "factor"),
> YR_MO = c(201011L, 201012L, 201101L, 201102L, 201103L, 201104L,
> 201105L, 201106L, 201107L, 201108L, 201109L), POS_TESTS = c(66L,
> 98L, 109L, 122L, 113L, 111L, 113L, 146L, 124L, 130L, 120L
> ), NEG_TESTS = c(349L, 393L, 376L, 371L, 396L, 367L, 406L,
> 383L, 394L, 412L, 379L)), .Names = c("X", "RESULT", "YR_MO",
> "POS_TESTS", "NEG_TESTS"), class = "data.frame", row.names = c(NA,
> -11L))
>
> As this is something that I understand Hadley Wickham's Reshape
> package is ideally suited for, I tried using the following reshape
> command:
>
> ReshapeDF <- recast(DF, YR_MO~variable)
>
> I get the following error message:
>
> Using RESULT as id variables
> Error: Casting formula contains variables not found in molten data:
> YR_MO
>
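A likely reading of that error: when no id variables are named, melt() guesses them from the factor and character columns, so only RESULT is kept as an id (hence "Using RESULT as id variables") and the integer column YR_MO is melted into the variable column along with X and TOT_TESTS. The sketch below shows one way around it, assuming reshape2 is installed and the 11th RESULT code has been corrected to POS: pass id.var and measure.var to recast() explicitly and cast on RESULT rather than on variable.

```r
library(reshape2)

# The poster's data, with the 11th RESULT code corrected from NEG to POS.
DF <- data.frame(
  X = 1:22,
  RESULT = factor(rep(c("POS", "NEG"), each = 11), levels = c("NEG", "POS")),
  YR_MO = rep(c(201011L, 201012L, 201101L, 201102L, 201103L, 201104L,
                201105L, 201106L, 201107L, 201108L, 201109L), times = 2),
  TOT_TESTS = c(66L, 98L, 109L, 122L, 113L, 111L, 113L, 146L, 124L, 130L,
                120L, 349L, 393L, 376L, 371L, 396L, 367L, 406L, 383L, 394L,
                412L, 379L)
)

# Name the id and measure columns so YR_MO survives the melt step.
ReshapeDF <- recast(DF, YR_MO ~ RESULT,
                    id.var = c("YR_MO", "RESULT"),
                    measure.var = "TOT_TESTS")
print(ReshapeDF)
```

With the uncorrected data, 201109 would have two NEG rows and no POS row, so the cast would need an aggregation function instead of returning the counts directly.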
> I have a workaround that allows me to get to my desired endpoint
> that involves splitting the data.frame into two (by test result),
> then using the YR_MO as the by.x/by.y in a merge, but I think this
> task would be handled more efficiently using reshape? Can anyone
> help me to see where I'm going wrong? Thanks in advance!
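For comparison, the split-and-merge workaround described here might look roughly like the following in base R. This is only a sketch of what the poster describes, not their actual code: the names pos, neg and merged are made up for illustration, and the data again assume the 11th RESULT code is POS.

```r
# The poster's data, with the 11th RESULT code corrected from NEG to POS.
DF <- data.frame(
  X = 1:22,
  RESULT = factor(rep(c("POS", "NEG"), each = 11), levels = c("NEG", "POS")),
  YR_MO = rep(c(201011L, 201012L, 201101L, 201102L, 201103L, 201104L,
                201105L, 201106L, 201107L, 201108L, 201109L), times = 2),
  TOT_TESTS = c(66L, 98L, 109L, 122L, 113L, 111L, 113L, 146L, 124L, 130L,
                120L, 349L, 393L, 376L, 371L, 396L, 367L, 406L, 383L, 394L,
                412L, 379L)
)

# Split by test result, keeping only the month key and the count.
pos <- DF[DF$RESULT == "POS", c("YR_MO", "TOT_TESTS")]
neg <- DF[DF$RESULT == "NEG", c("YR_MO", "TOT_TESTS")]
names(pos)[2] <- "POS_TESTS"
names(neg)[2] <- "NEG_TESTS"

# Join the two halves on the month key (hypothetical result name: merged).
merged <- merge(pos, neg, by = "YR_MO")
print(merged)
```

This gives the same counts per month as a cast on RESULT, at the cost of one subset and one rename per result level.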
>
David Winsemius, MD
Heritage Laboratories
West Hartford, CT