[R] Help with transpose please.
Daniel Nordlund
djnord|und @end|ng |rom gm@||@com
Sat Jun 23 12:04:26 CEST 2018
On 6/22/2018 4:43 AM, Bill Poling wrote:
> Good morning.
>
>
> I have data in the form:
>
> head(Edit041IA, n=25)
> ClaimServiceID ClaimID DiagnosisCode
> 1 183056004 78044473 C562
> 2 183056004 78044473 C778
> 3 183056004 78044473 C784
> 4 183056004 78044473 C786
> 5 183056004 78044473 C7961
> 6 183056004 78044473 C7982
> 7 183056004 78044473 C7989
> 8 183056008 78044473 C562
> 9 183056008 78044473 C778
> 10 183056008 78044473 C784
> 11 183056008 78044473 C786
> 12 183056008 78044473 C7961
> 13 183056008 78044473 C7982
> 14 183056008 78044473 C7989
> 15 183139945 78078925 M79606
> 16 183139945 78078925 M7989
> 17 183139945 78078925 R600
> 18 183236728 78119632 H02831
> 19 183236728 78119632 H02832
> 20 183236728 78119632 H02834
> 21 183236728 78119632 H02835
> 22 183236728 78119632 H04123
> 23 183236728 78119632 Z411
> 24 183236728 78119632 H2513
> 25 183236728 78119632 H43813
>
> And wish to transpose to single record for single claimServiceID, ClaimID, and Dx1,Dx2,Dx3, etc:
>
> There can be multiple claimServiceIDs for a ClaimID so I want the unique ClaimServiceID as the identifier when I join this data back into a longer single record length file by that column.
>
> claimServiceID ClaimID Dx1 Dx2 Dx3 ...etc
> 1 183056004 78044473 C562 C778 C784 C786 C7961 ...etc
> 2 183056008 78044473 C562 C778 C784 C786 C7961 ...etc
>
>
> (If you would prefer the complete dput of the 1272 records I will gladly provide .Names = c("ClaimServiceID",
>
> "ClaimID", "DiagnosisCode"), class = "data.frame", row.names = c(NA,
>
> -1272L))
>
>
>
> At the moment the classes are:
>
> classes <- as.character(sapply(Edit041IA, class))
>
> classes
>
> # [1] "integer" "integer" "character" <---but do not have to be if that helps its just that's how the csv load pulled them in
>
> The max number of columns based on this transpose of the DiagnosisCode column (in this dataset) is 12 if that is important to know.
>
> I have looked at a variety of webpages and cannot get this right,
>
> dta2 <- melt(dta1, id=c("ClaimServiceID","ClaimID"))
> View(dta2)
> # https://www.r-bloggers.com/pivot-tables-in-r/
>
> # https://stackoverflow.com/questions/18449938/pivot-on-data-table-similar-to-rehape-melt-function
>
>
> dta3 <- cast(Edit041IA, ClaimServiceID ~ DiagnosisCode, ClaimID)
> View(dta3)
> dta3 <- cast(Edit041IA, DiagnosisCode ~ ClaimServiceID, ClaimID)
> View(dta3)
>
> dta3 <- melt(Edit041IA, id=c("ClaimServiceID"))
> View(dta3)
>
> dta3 <- aggregate(Edit041IA, by=list(ClaimServiceID, ClaimID, DiagnosisCode))
> View(dta3)
>
>
> dta3 <- aggregate(Edit041IA, by=list(ClaimServiceID))
> View(dta3)
> # https://www.r-statistics.com/tag/transpose/
>
> dta3 <- aggregate(Edit041IA, by=list(DiagnosisCode, ClaimServiceID, ClaimID))
> View(dta3)
>
>
> I am sure it's a basic, simple procedure, but I am pressed for time on this one, any support will be greatly appreciated, thank you.
>
> WHP
>
>
>
>
>
> Confidentiality Notice This message is sent from Zelis. ...{{dropped:15}}
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Bill,
you have received some good suggestions and since you are pressed for
time this may be too late. However, here is a solution using ave()
function and cast() from the reshape package.
# create diagnosis variable names
dxnames <- paste('Dx',ave(rep(1, nrow(have)), have[,1:2], FUN =
seq_along), sep='')
# cast the data into wide format
cast(cbind(have,dxnames), ClaimServiceID + ClaimID ~ dxnames,
value='DiagnosisCode')
Hope this is helpful,
Dan
--
Daniel Nordlund
Port Townsend, WA USA
More information about the R-help
mailing list