[R] Need fresh eyes to see what I'm missing

Eric Berger ericjberger at gmail.com
Tue Sep 14 17:29:50 CEST 2021


Before you create vel_by_month, you can check vel for NAs and NaNs with:

sum(is.na(vel))
sum(unlist(lapply(vel, is.nan)))
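
If those counts are non-zero, it may also help to find the rows that fail to
parse before the as.integer()/as.double() calls coerce them to NA. A rough
sketch, re-using the file path from your script; the "every field should be a
plain number" check below is just an assumption about what your file should
contain:

vel_raw <- read.csv('../data/water/vel.dat', header = TRUE, sep = ',',
                    colClasses = 'character')
# flag rows where any field is not purely numeric -- e.g. a repeated
# header line, a blank field, or stray text somewhere in the file
bad <- !apply(vel_raw, 1, function(r) all(grepl('^[0-9.]+$', r)))
vel_raw[bad, ]

Once you see where those rows come from, you can either clean the file or
drop the incomplete rows, e.g. vel <- vel[complete.cases(vel), ], before the
group_by()/summarize() step, which should also remove the year 0 / NaN row
from vel_by_month.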

HTH,
Eric


On Tue, Sep 14, 2021 at 6:21 PM Rich Shepard <rshepard using appl-ecosys.com>
wrote:

> The data file begins this way:
> year,month,day,hour,min,fps
> 2016,03,03,12,00,1.74
> 2016,03,03,12,10,1.75
> 2016,03,03,12,20,1.76
> 2016,03,03,12,30,1.81
> 2016,03,03,12,40,1.79
> 2016,03,03,12,50,1.75
> 2016,03,03,13,00,1.78
> 2016,03,03,13,10,1.81
>
> The script to process it:
> library('tidyverse')
> vel <- read.csv('../data/water/vel.dat', header = TRUE, sep = ',',
> stringsAsFactors = FALSE)
> vel$year <- as.integer(vel$year)
> vel$month <- as.integer(vel$month)
> vel$day <- as.integer(vel$day)
> vel$hour <- as.integer(vel$hour)
> vel$min <- as.integer(vel$min)
> vel$fps <- as.double(vel$fps, length = 6)
>
> # use dplyr to filter() by year, month, day; summarize() to get monthly
> # means
> vel_by_month = vel %>%
>      group_by(year, month) %>%
>      summarize(flow = mean(fps, na.rm = TRUE))
>
> R's display after running the script:
> > source('vel.R')
> `summarise()` has grouped output by 'year'. You can override using the
> `.groups` argument.
> Warning messages:
> 1: In eval(ei, envir) : NAs introduced by coercion
> 2: In eval(ei, envir) : NAs introduced by coercion
> 3: In eval(ei, envir) : NAs introduced by coercion
>
> The dataframe created by the read.csv() command:
> > head(vel)
>    year month day hour min  fps
> 1 2016     3   3   12   0 1.74
> 2 2016     3   3   12  10 1.75
> 3 2016     3   3   12  20 1.76
> 4 2016     3   3   12  30 1.81
> 5 2016     3   3   12  40 1.79
> 6 2016     3   3   12  50 1.75
>
> and the resulting grouping:
> > vel_by_month
> # A tibble: 67 × 3
> # Groups:   year [8]
>      year month   flow
>     <int> <int>  <dbl>
>   1     0    NA NaN
>   2  2016     3   2.40
>   3  2016     4   3.00
>   4  2016     5   2.86
>   5  2016     6   2.51
>   6  2016     7   2.18
>   7  2016     8   1.89
>   8  2016     9   1.38
>   9  2016    10   1.73
> 10  2016    11   2.01
> # … with 57 more rows
>
> I cannot find why line 1 is there. Other data sets don't produce this
> result.
>
> TIA,
>
> Rich
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



