[R] Display time of PDF plots
MacQueen, Don
m@cqueen1 @end|ng |rom ||n|@gov
Wed Sep 5 22:05:02 CEST 2018
(this is somewhat of a change of subject from the original question)
Rich, there are functions such as aggregate() in base R. There are also many options in CRAN packages.
But I tend to have difficulty getting them to do exactly what I want, and usually end up rolling my own.
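For comparison, here is a minimal sketch of the aggregate() route. It runs on a small made-up stand-in for 'rainfall' (the station names and values are invented for illustration), so treat it as a starting point rather than something tested on the real data.

```r
## toy stand-in for the 'rainfall' data frame; all values are made up
rainfall <- data.frame(
  name     = rep(c("Station A", "Station B"), each = 4),
  sampdate = as.Date(rep(c("2005-01-01", "2005-01-02",
                           "2005-02-01", "2005-02-02"), 2)),
  prcp     = c(0.59, 0.08, 0.10, 0.00, 0.20, 0.40, 0.00, 0.30),
  stringsAsFactors = FALSE
)
rainfall$mon <- format(rainfall$sampdate, "%Y-%m")

## one row per station and month; because FUN returns a named vector,
## the 'prcp' column of the result is a two-column matrix
agg <- aggregate(prcp ~ name + mon, data = rainfall,
                 FUN = function(x) c(med = median(x), max = max(x)))

## flatten the matrix column into ordinary columns prcp.med and prcp.max
agg <- do.call(data.frame, agg)
agg
```

Note that the formula interface drops rows with missing values by default, which may or may not be what you want.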
The idea is to split the data into groups by station and month, then calculate summary stats for each group, then recombine into a new data frame.
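The code in this message relies on each station having a single easting, northing, and elevation (see the code comments). A quick way to check that before splitting might look like the sketch below; the toy data are made up, with one station deliberately given inconsistent elevations.

```r
## toy stand-in for 'rainfall'; Station B has two different elevations on purpose
rainfall <- data.frame(
  name     = rep(c("Station A", "Station B"), each = 3),
  easting  = rep(c(2370575, 2400000), each = 3),
  northing = rep(c(199338, 210000), each = 3),
  elev     = c(228, 228, 228, 510, 510, 512),
  stringsAsFactors = FALSE
)

## count the distinct values of each attribute within each station
n.distinct <- aggregate(cbind(easting, northing, elev) ~ name, data = rainfall,
                        FUN = function(v) length(unique(v)))

## stations where any attribute takes more than one value
bad <- n.distinct$name[rowSums(n.distinct[-1] > 1) > 0]
bad
```

If 'bad' comes back empty, the unique() calls in the summary function should each return a single value per group.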
## untested with your data, but this kind of approach works well for me
## note that this code assumes easting, northing, and elevation are in fact unique within each group
## if they are not, data.frame() will either recycle values into extra rows for that group or stop with an ERROR
## add a 'month' variable
raindf <- rainfall
raindf$mon <- format(raindf$sampdate,'%Y-%m')
mysum <- function(df) {
  data.frame( name=unique(df$name),
              easting=unique(df$easting),
              northing=unique(df$northing),
              elev=unique(df$elev),
              mon=unique(df$mon),
              pr.med=median(df$prcp),
              pr.max=max(df$prcp) )
}
tmpdf <- split(raindf, paste(raindf$name, raindf$mon) )
## at this point, you can check your summary stats function with, for example,
mysum(tmpdf[[1]])
mysum(tmpdf[[2]])
## when satisfied with mysum(), do this
tmpsum <- lapply(tmpdf, mysum)
## recombine
rain.by.mon <- do.call(rbind, tmpsum)
## might still want to create a numeric month to facilitate plotting
## or maybe assign each month to the first of the month, or the 15th, or end or whatever makes sense
rain.by.mon$mondt <- as.Date(paste0(rain.by.mon$mon,'-1'))
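Once the date column exists, plotting one station's monthly maximum and median could go something like this. It is only a sketch: the 'rain.by.mon' values below are made up, and the station name, axis labels, and line types are arbitrary choices.

```r
## toy stand-in for 'rain.by.mon'; all values are made up
rain.by.mon <- data.frame(
  name   = rep(c("Station A", "Station B"), each = 3),
  mon    = rep(c("2005-01", "2005-02", "2005-03"), 2),
  pr.med = c(0.10, 0.05, 0.00, 0.20, 0.10, 0.02),
  pr.max = c(1.2, 0.8, 0.3, 1.5, 0.9, 0.4),
  stringsAsFactors = FALSE
)
rain.by.mon$mondt <- as.Date(paste0(rain.by.mon$mon, "-01"))

## pick one station and put its rows in date order
one <- rain.by.mon[rain.by.mon$name == "Station A", ]
one <- one[order(one$mondt), ]

## monthly maximum as a solid line, monthly median as a dashed line
plot(one$mondt, one$pr.max, type = "b", ylim = c(0, max(one$pr.max)),
     xlab = "Month", ylab = "Precipitation", main = "Station A")
lines(one$mondt, one$pr.med, type = "b", lty = 2)
legend("topright", legend = c("monthly max", "monthly median"), lty = c(1, 2))
```

With one point per month instead of one per day, there are far fewer points to draw for each station.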
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
On 9/4/18, 9:41 AM, "R-help on behalf of Rich Shepard" <r-help-bounces using r-project.org on behalf of rshepard using appl-ecosys.com> wrote:
On Mon, 3 Sep 2018, Rich Shepard wrote:
> Is there a process by which these plots can be 'thinned' so they show the
> same overall patterns but with fewer points so they display more quickly?
Bert/Paul/David/John:
Thanks very much for the suggestions. I think an appropriate way to
illustrate the patterns is to plot the median and maximum for each month
(for all sites). That's the important information and plotting each daily
point over 13 years obscures that information.
The dataframe is structured this way:
str(rainfall)
'data.frame': 113569 obs. of 6 variables:
$ name : chr "Headworks Portland Water" "Headworks Portland Water" "Headworks Portland Water" "Headworks Portland Water" ...
$ easting : num 2370575 2370575 2370575 2370575 2370575 ...
$ northing: num 199338 199338 199338 199338 199338 ...
$ elev : num 228 228 228 228 228 228 228 228 228 228 ...
$ sampdate: Date, format: "2005-01-01" "2005-01-02" ...
$ prcp : num 0.59 0.08 0.1 0 0 0.02 0.05 0.1 0 0.02 ...
There are probably multiple ways of extracting the monthly median and
maximum 'prcp' and I don't know how to identify the appropriate one. Is
there a task view for this type of data manipulation? I've not before done
anything like this and would appreciate a pointer to where I start to learn.
Regards,
Rich