William Dunlap
wdunlap at tibco.com
Mon Mar 23 16:59:04 CET 2015
My stock advice is to add code that checks that the
stuff you read from the file has the expected format.
Do this right after the read.csv() call.
The stopifnot function gives a quick way to code the checks, and it
helps to write helper functions when checks must be repeated. E.g.,
# TRUE only if x is numeric with no NAs and every value is a whole number >= 0.
allNonNegIntegers <- function(x) {
    is.numeric(x) && all(!is.na(x) & x >= 0 & x %% 1 == 0)
}
stopifnot(
    all(c("Weight", "LC09", "LC10") %in% names(case_weights)),
    is.numeric(case_weights$Weight),
    allNonNegIntegers(case_weights$LC09),
    allNonNegIntegers(case_weights$LC10)
    # <... more comma-separated check expressions>
)
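To see the checks in action, here is a minimal sketch that applies the same helper to every count column at once. The small data frame is a made-up stand-in for the result of read.csv (the real file's contents are not shown in this thread), and countCols and result are names introduced only for this illustration; the column names are taken from the quoted script below.

```r
# Hypothetical stand-in for read.csv("case_weights.csv"); the values are made up.
case_weights <- data.frame(
    Weight = c(154.2, 154.4, 154.6),
    LC09   = c(2, 5, 1),
    LC10   = c(0, 3, 4),
    LP14b1 = c(1, 1, 2),
    LP14b2 = c(3, 0, 2)
)

# TRUE only if x is numeric with no NAs and every value is a whole number >= 0.
allNonNegIntegers <- function(x) {
    is.numeric(x) && all(!is.na(x) & x >= 0 & x %% 1 == 0)
}

# Count columns used by the quoted script (illustrative name).
countCols <- c("LC09", "LC10", "LP14b1", "LP14b2")

# Stop right away if the data does not have the expected shape.
stopifnot(
    all(c("Weight", countCols) %in% names(case_weights)),
    is.numeric(case_weights$Weight),
    all(vapply(case_weights[countCols], allNonNegIntegers, logical(1)))
)

# A fractional count should make the check fail; tryCatch is used here
# only so that this demonstration keeps running after the error.
bad <- case_weights
bad$LC10[2] <- 2.5
result <- tryCatch(
    {
        stopifnot(allNonNegIntegers(bad$LC10))
        "passed"
    },
    error = function(e) "failed"
)
```

Checking the counts up front matters here because rep() needs non-negative whole-number counts, so a stray NA, negative number, or text value in a count column would otherwise surface later as a less obvious error.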
On Sun, Mar 22, 2015 at 6:21 PM, memilanuk <memilanuk at gmail.com> wrote:
> So... wrote my first script, rather than just using the interactive
> console. I think I got everything working more or less the way I want, but
> I'm sure there's a ton of room for improvement. Specifically in the way of
> automation - but that's where I kind of ran out of steam. Any suggestions
> would be much appreciated.
>
> case_data.r
>
> # Import CSV file into a data frame.
> case_weights <- read.csv(file = "case_weights.csv")
>
> # For each row, take the number in the Weight column and replicate it
> # as many times as there are in each count column.
> LC09 <- rep(case_weights$Weight, case_weights$LC09)
> LC10 <- rep(case_weights$Weight, case_weights$LC10)
> LP14b1 <- rep(case_weights$Weight, case_weights$LP14b1)
> LP14b2 <- rep(case_weights$Weight, case_weights$LP14b2)
>
> # Determine the longest vector, to help with the next step.
> max.len <- max(length(LC09), length(LC10), length(LP14b1),
> length(LP14b2))
>
> # Pad each vector with NA so they are all the same length and will
> # go in a data frame.
> LC09 <- c(LC09, rep(NA, max.len - length(LC09)))
> LC10 <- c(LC10, rep(NA, max.len - length(LC10)))
> LP14b1 <- c(LP14b1, rep(NA, max.len - length(LP14b1)))
> LP14b2 <- c(LP14b2, rep(NA, max.len - length(LP14b2)))
>
> # Stick everything back into one data frame.
> case_dat <- data.frame(LC09)
> case_dat$LC10 <- LC10
> case_dat$LP14b1 <- LP14b1
> case_dat$LP14b2 <- LP14b2
>
> # Stuff said data frame back into a CSV for use elsewhere (plot.ly).
> write.csv(case_dat, file = "expanded_case_weights.csv")
>
> # Boxplot it
> boxplot(case_dat, varwidth = TRUE, notch = TRUE, horizontal = TRUE,
> main = "Case Weights", xlab = "Weight (grains)",
> ylab = "Batch", las = 1, names = c("LC09", "LC10", "LP14b1",
> "LP14b2"))
>
>
>
