[R] Combine two dataframe with different row number and interpolation between values

javad bayat j@b@y@t194 @end|ng |rom gm@||@com
Wed Aug 31 08:08:46 CEST 2022


 Dear all,
I am trying to combine two large dataframe in order to make a dataframe
with exactly the dimension of the second dataframe.
The first df is as follows:

df1 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 2920), d =
rep(c(1:365,1:365,1:365,1:365,1:365),each=8),
      h = rep(c(seq(3,24, by = 3),seq(3,24, by = 3),seq(3,24, by =
3),seq(3,24, by = 3),seq(3,24, by = 3)),365),
      ws = rnorm(1:14600, mean=20))
> head(df1)
     y       d   h        ws
1  2010  1  3     20.71488
2  2010  1  6     19.70125
3  2010  1  9     21.00180
4  2010  1 12     20.29236
5  2010  1 15     20.12317
6  2010  1 18     19.47782

The data in the "ws" column were measured with 3 hours frequency and I need
data with one hour frequency. I have made a second df as follows with one
hour frequency for the "ws" column.

df2 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 8760), d =
rep(c(1:365,1:365,1:365,1:365,1:365),each=24),
      h = rep(c(1:24,1:24,1:24,1:24,1:24),365), ws = "NA")
> head(df2)
      y      d    h   ws
1  2010  1    1   NA
2  2010  1    2   NA
3  2010  1    3   NA
4  2010  1    4   NA
5  2010  1    5   NA
6  2010  1    6   NA

What I am trying to do is combine these two dataframes so as to the rows in
df1 (based on the values of "y", "d", "h" columns) that have values exactly
similar to df2's rows copied in its place in the new df (df3).
For example, in the first dataframe the first row was measured at 3 o'clock
on the first day of 2010 and this row must be placed on the third row of
the second dataframe which has a similar value (2010, 1, 3). Like the below
table:
      y      d    h   ws
1  2010  1    1   NA
2  2010  1    2   NA
3  2010  1    3   20.71488
4  2010  1    4   NA
5  2010  1    5   NA
6  2010  1    6   19.70125

But regarding the values of the "ws" column for df2 that do not have value
(at 4 and 5 o'clock), I need to interpolate between the before and after
values to fill in the missing data of the "ws".
I have tried the following codes but they did not work correctly.

> df3 = merge(df1, df2, by = "y")
Error: cannot allocate vector of size 487.9 Mb
or
> library(dplyr)
> df3<- df1%>% full_join(df2)


Is there any way to do this?
Sincerely





-- 
Best Regards
Javad Bayat
M.Sc. Environment Engineering
Alternative Mail: bayat194 using yahoo.com

	[[alternative HTML version deleted]]



More information about the R-help mailing list