[R] Display time of PDF plots

MacQueen, Don
Wed Sep 5 22:05:02 CEST 2018


(this is somewhat a change of subject from the original question)

Rich, there are functions such as aggregate() in base R. There are also many options in CRAN packages.

But I tend to have difficulty getting them to do exactly what I want, and usually end up rolling my own.

The idea is to split the data into groups by station and month, then calculate summary stats for each group, then recombine into a new data frame.

## untested with your data, but this kind of approach works well for me
## note that this code assumes easting, northing, and elevation are in fact unique within each group
## if they are not, you will get more than one row for that group, or an ERROR if the lengths cannot be recycled

## add a 'month' variable
raindf <- rainfall
raindf$mon <- format(raindf$sampdate,'%Y-%m')
  
## summary stats for one station/month group; returns a one-row data frame
mysum <- function(df) {
  data.frame( name=unique(df$name),
              easting=unique(df$easting),
              northing=unique(df$northing),
              elev=unique(df$elev),
              mon=unique(df$mon),
              pr.med=median(df$prcp),
              pr.max=max(df$prcp) )
}

tmpdf <- split(raindf, paste(raindf$name, raindf$mon) )

## at this point, you can check your summary stats function with, for example,
mysum(tmpdf[[1]])
mysum(tmpdf[[2]])

## when satisfied with mysum(), do this
tmpsum <- lapply(tmpdf, mysum)

## recombine
rain.by.mon <- do.call(rbind, tmpsum)

## might still want to create a numeric month to facilitate plotting
## or maybe assign each month to the first of the month, or the 15th, or end or whatever makes sense
rain.by.mon$mondt <- as.Date(paste0(rain.by.mon$mon,'-1'))
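
For comparison, the aggregate() route mentioned at the top could look roughly like this. It is only a minimal sketch on made-up data: the toy 'rainfall' data frame, its station names and values, and the name rain.by.mon2 are assumptions for illustration, not Rich's data. Note that the formula interface of aggregate() drops rows with missing values by default, which the split/lapply version does not do.

```r
## toy stand-in for 'rainfall' (made-up values: two stations, two months)
rainfall <- data.frame(
  name     = rep(c("Station A", "Station B"), each = 4),
  easting  = rep(c(100, 200), each = 4),
  northing = rep(c(10, 20), each = 4),
  elev     = rep(c(50, 60), each = 4),
  sampdate = rep(as.Date(c("2005-01-01", "2005-01-02",
                           "2005-02-01", "2005-02-02")), times = 2),
  prcp     = c(0.5, 0.1, 0.0, 0.2, 1.0, 0.4, 0.3, 0.7),
  stringsAsFactors = FALSE
)
rainfall$mon <- format(rainfall$sampdate, "%Y-%m")

## one row per station/month; the summary function returns two named values
agg <- aggregate(prcp ~ name + easting + northing + elev + mon,
                 data = rainfall,
                 FUN = function(x) c(med = median(x), max = max(x)))

## aggregate() stores the two statistics as a matrix column; flatten it
rain.by.mon2 <- data.frame(agg[, c("name", "easting", "northing", "elev", "mon")],
                           pr.med = agg$prcp[, "med"],
                           pr.max = agg$prcp[, "max"])
```

Putting easting, northing, and elev on the right-hand side of the formula carries them through to the result. If they are not constant within a station, this version quietly produces extra groups, so it is worth checking the number of rows.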




--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 

On 9/4/18, 9:41 AM, "R-help on behalf of Rich Shepard" wrote:

    On Mon, 3 Sep 2018, Rich Shepard wrote:
    
    > Is there a process by which these plots can be 'thinned' so they show the
    > same overall patterns but with fewer points so they display more quickly?
    
    Bert/Paul/David/John:
    
       Thanks very much for the suggestions. I think an appropriate way to
    illustrate the patterns is to plot the median and maximum for each month
    (for all sites). That's the important information and plotting each daily
    point over 13 years obscures that information.
    
       The dataframe is structured this way:
    
    str(rainfall)
    'data.frame':	113569 obs. of  6 variables:
      $ name    : chr  "Headworks Portland Water" "Headworks Portland Water" "Headworks Portland Water" "Headworks Portland Water" ...
      $ easting : num  2370575 2370575 2370575 2370575 2370575 ...
      $ northing: num  199338 199338 199338 199338 199338 ...
      $ elev    : num  228 228 228 228 228 228 228 228 228 228 ...
      $ sampdate: Date, format: "2005-01-01" "2005-01-02" ...
      $ prcp    : num  0.59 0.08 0.1 0 0 0.02 0.05 0.1 0 0.02 ...
    
       There are probably multiple ways of extracting the monthly median and
    maximum 'prcp' and I don't know how to identify the appropriate one. Is
    there a task view for this type of data manipulation? I've not before done
    anything like this and would appreciate a pointer to where I start to learn.
    
    Regards,
    
    Rich
    


