[R] [External] challenging data merging/joining problem

Rasmus Liland
Mon Jul 6 18:15:00 CEST 2020


On 2020-07-06 12:03 +0300, Eric Berger wrote:
> On Mon, Jul 6, 2020 at 2:07 AM Richard M. Heiberger <rmh using temple.edu> wrote:
> > On Sun, Jul 5, 2020 at 2:51 PM Christopher W. Ryan <cryan using binghamton.edu> wrote:
> > >
> > > I've been conducting relatively simple 
> > > COVID-19 surveillance for our 
> > > jurisdiction. 
> >
> > Have you talked directly to the designers 
> > of the new database?
> 
> Hi Christopher,
> This seems pretty standard and 
> straightforward, unless I am missing 
> something. You can do the "full join" 
> without changing variable names.  Here's a 
> small code example with two tibbles, a and 
> b, where the column 'x' in a corresponds to 
> the column 'u' in b.
> 
> library(tibble)
> a <- tibble(x=1:15,y=21:35)
> b <- tibble(u=c(1:10,51:55),z=31:45)
> foo <- merge(a,b,by.x="x",by.y="u",all.x=TRUE,all.y=TRUE)

Perhaps rename the key columns of dataSystemA to match 
those of dataSystemB and then do the full join, 
something like

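	# Old key column names in dataSystemA (names) mapped to
	# the names used in dataSystemB (values)
	new_names <-
	  c("dob"="birthdate",
	    "lastName"="last_name",
	    "firstName"="first_name")
	# Rename the matching columns of dataSystemA
	idx <- match(x=names(new_names),
	  table=colnames(dataSystemA))
	colnames(dataSystemA)[idx] <- new_names
	# Full join on the now-shared key columns
	merge(
	  x=dataSystemA,
	  y=dataSystemB,
	  by=unname(new_names),
	  all=TRUE)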

which yields

	   birthdate  last_name first_name  onsetDate
	1 2010-10-11   LOVEGOOD       luna       <NA>
	2 2010-12-06   GRAINGER   hermione 2020-07-09
	3 2011-01-25 LONGBOTTOM    neville 2020-07-10
	4 2011-07-03     MALFOY      draco       <NA>
	5 2011-07-14    WEASLEY        ron 2020-07-08
	6 2011-10-04     POTTER      harry 2020-07-07
	7 2012-02-13    DIGGORY     cedric       <NA>
	  symptomatic date_of_onset symptoms_present
	1          NA    2020-07-12            FALSE
	2          NA    2020-07-09             TRUE
	3          NA    2020-07-10             TRUE
	4          NA    2020-07-11            FALSE
	5       FALSE          <NA>               NA
	6        TRUE          <NA>               NA
	7          NA    2020-07-13             TRUE

?
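
In case anyone wants to run this: the two data frames are not 
included in this message, so the sketch below rebuilds 
hypothetical inputs by working backwards from the merged output 
above.  Which rows and columns belong to which system, and 
keeping the dates as plain character strings, are assumptions 
made for illustration only.  With these inputs the merge 
should give the same seven rows as the table above.

```r
# Hypothetical inputs reconstructed from the merged output;
# the real dataSystemA/dataSystemB may differ.
dataSystemA <- data.frame(
  dob = c("2010-12-06", "2011-01-25", "2011-07-14", "2011-10-04"),
  lastName = c("GRAINGER", "LONGBOTTOM", "WEASLEY", "POTTER"),
  firstName = c("hermione", "neville", "ron", "harry"),
  onsetDate = c("2020-07-09", "2020-07-10", "2020-07-08", "2020-07-07"),
  symptomatic = c(NA, NA, FALSE, TRUE),
  stringsAsFactors = FALSE)
dataSystemB <- data.frame(
  birthdate = c("2010-10-11", "2010-12-06", "2011-01-25",
                "2011-07-03", "2012-02-13"),
  last_name = c("LOVEGOOD", "GRAINGER", "LONGBOTTOM",
                "MALFOY", "DIGGORY"),
  first_name = c("luna", "hermione", "neville", "draco", "cedric"),
  date_of_onset = c("2020-07-12", "2020-07-09", "2020-07-10",
                    "2020-07-11", "2020-07-13"),
  symptoms_present = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE)

# Old key names in system A (names) -> names used in system B (values)
new_names <- c("dob" = "birthdate",
               "lastName" = "last_name",
               "firstName" = "first_name")

# Rename the key columns of system A so both frames share them
idx <- match(x = names(new_names), table = colnames(dataSystemA))
colnames(dataSystemA)[idx] <- new_names

# Full join: all = TRUE keeps rows found in only one system
merged <- merge(x = dataSystemA, y = dataSystemB,
                by = unname(new_names), all = TRUE)
print(merged)
```

Rows present in only one system get NA in the other system's 
columns, and merge() sorts the result on the key columns by 
default (sort=TRUE).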
