[R] Combine two dataframe with different row number and interpolation between values
PIKAL Petr
petr@p|k@| @end|ng |rom prechez@@cz
Wed Aug 31 08:51:46 CEST 2022
Hallo
And missing value interpolation is rather tricky business dependent on what
is underlying process.
Maybe na.locf from zoo package?
Or approxfun?, splinefun?
Cheers
Petr
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of javad bayat
> Sent: Wednesday, August 31, 2022 8:09 AM
> To: r-help using r-project.org
> Subject: [R] Combine two dataframe with different row number and
> interpolation between values
>
> Dear all,
> I am trying to combine two large dataframe in order to make a dataframe
with
> exactly the dimension of the second dataframe.
> The first df is as follows:
>
> df1 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 2920), d =
> rep(c(1:365,1:365,1:365,1:365,1:365),each=8),
> h = rep(c(seq(3,24, by = 3),seq(3,24, by = 3),seq(3,24, by =
3),seq(3,24, by =
> 3),seq(3,24, by = 3)),365),
> ws = rnorm(1:14600, mean=20))
> > head(df1)
> y d h ws
> 1 2010 1 3 20.71488
> 2 2010 1 6 19.70125
> 3 2010 1 9 21.00180
> 4 2010 1 12 20.29236
> 5 2010 1 15 20.12317
> 6 2010 1 18 19.47782
>
> The data in the "ws" column were measured with 3 hours frequency and I
need
> data with one hour frequency. I have made a second df as follows with one
hour
> frequency for the "ws" column.
>
> df2 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 8760), d =
> rep(c(1:365,1:365,1:365,1:365,1:365),each=24),
> h = rep(c(1:24,1:24,1:24,1:24,1:24),365), ws = "NA")
> > head(df2)
> y d h ws
> 1 2010 1 1 NA
> 2 2010 1 2 NA
> 3 2010 1 3 NA
> 4 2010 1 4 NA
> 5 2010 1 5 NA
> 6 2010 1 6 NA
>
> What I am trying to do is combine these two dataframes so as to the rows
in
> df1 (based on the values of "y", "d", "h" columns) that have values
exactly
> similar to df2's rows copied in its place in the new df (df3).
> For example, in the first dataframe the first row was measured at 3
o'clock on
> the first day of 2010 and this row must be placed on the third row of the
second
> dataframe which has a similar value (2010, 1, 3). Like the below
> table:
> y d h ws
> 1 2010 1 1 NA
> 2 2010 1 2 NA
> 3 2010 1 3 20.71488
> 4 2010 1 4 NA
> 5 2010 1 5 NA
> 6 2010 1 6 19.70125
>
> But regarding the values of the "ws" column for df2 that do not have value
(at 4
> and 5 o'clock), I need to interpolate between the before and after values
to fill in
> the missing data of the "ws".
> I have tried the following codes but they did not work correctly.
>
> > df3 = merge(df1, df2, by = "y")
> Error: cannot allocate vector of size 487.9 Mb or
> > library(dplyr)
> > df3<- df1%>% full_join(df2)
>
>
> Is there any way to do this?
> Sincerely
>
>
>
>
>
> --
> Best Regards
> Javad Bayat
> M.Sc. Environment Engineering
> Alternative Mail: bayat194 using yahoo.com
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list