[R] Make 2nd col of 2-col df into header row of same df then adjust col1 data display
Chel Hee Lee
chl948 at mail.usask.ca
Thu Dec 18 20:43:07 CET 2014
I like the approach Jeff Newmiller presented in the previous post. As he
suggested, it is a good idea to start with a 'factor', since you already
know all possible values of 'Primary.Viol.Type'. You could also use the
'split()' function to build the table you want. Please see below (I hope
this helps):
> PViol.Type.Per.Case.Original$Primary.Viol.Type <-
+   factor(Primary.Viol.Type, levels=PViol.Type, labels=PViol.Type)
>
> tmp <- split(PViol.Type.Per.Case.Original,
+              PViol.Type.Per.Case.Original$CaseID)
> ans <- ifelse(do.call(rbind, lapply(tmp, function(x)
+   table(x$Primary.Viol.Type))), 1, NA)
> ans
        CaseID BW.BackWages LD.Liquid_Damages MW.Minimum_Wage OT.Overtime
1005317     NA           NA                NA              NA          NA
1007183     NA           NA                NA              NA           1
1008833     NA           NA                NA              NA           1
1012281     NA           NA                NA              NA          NA
1015285     NA           NA                NA              NA          NA
1015315     NA           NA                NA              NA           1
1015322     NA           NA                NA              NA          NA
        RK.Records_FLSA V.Poster_Other AS.Age BW.WHMIS_BackWages HS.Hours
1005317              NA             NA     NA                 NA        1
1007183              NA             NA     NA                 NA       NA
1008833              NA             NA     NA                 NA       NA
1012281              NA             NA     NA                 NA        1
1015285              NA              1      1                 NA        1
1015315              NA             NA     NA                 NA       NA
1015322              NA              1     NA                 NA       NA
        OA.HazOccupationAg ON.HazOccupationNonAg R3.Reg3AgeOccupation
1005317                 NA                    NA                   NA
1007183                 NA                    NA                   NA
1008833                 NA                    NA                   NA
1012281                 NA                    NA                   NA
1015285                 NA                    NA                   NA
1015315                 NA                    NA                   NA
1015322                 NA                    NA                   NA
        RK.Records_CL V.Other
1005317            NA      NA
1007183            NA      NA
1008833            NA      NA
1012281            NA      NA
1015285             1      NA
1015315            NA      NA
1015322            NA      NA
>
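
For anyone who wants to run this from a clean session, here is a minimal self-contained sketch of the same 'split()' idea. It assumes the 14-element 'PViol.Type' from Jeff's post (without "CaseID"); the all-NA "CaseID" column in the output above appears to come from "CaseID" still being among the factor levels. The names 'df', 'by.case', 'counts', 'flags' and 'wide' are only illustrative and do not appear in the thread.

```r
# Data from the original post
CaseID <- c("1015285", "1005317", "1012281", "1015285", "1015285",
            "1007183", "1008833", "1015315", "1015322", "1015285")
Primary.Viol.Type <- c("AS.Age", "HS.Hours", "HS.Hours", "HS.Hours",
                       "RK.Records_CL", "OT.Overtime", "OT.Overtime",
                       "OT.Overtime", "V.Poster_Other", "V.Poster_Other")

# The 14 violation types, without "CaseID" (assumption: Jeff's version)
PViol.Type <- c("BW.BackWages", "LD.Liquid_Damages", "MW.Minimum_Wage",
                "OT.Overtime", "RK.Records_FLSA", "V.Poster_Other",
                "AS.Age", "BW.WHMIS_BackWages", "HS.Hours",
                "OA.HazOccupationAg", "ON.HazOccupationNonAg",
                "R3.Reg3AgeOccupation", "RK.Records_CL", "V.Other")

df <- data.frame(CaseID,
                 Primary.Viol.Type = factor(Primary.Viol.Type,
                                            levels = PViol.Type),
                 stringsAsFactors = FALSE)

# One group of rows per case, then one vector of level counts per group
by.case <- split(df, df$CaseID)
counts  <- do.call(rbind,
                   lapply(by.case, function(x) table(x$Primary.Viol.Type)))

# Zero counts become NA, positive counts become 1
flags <- ifelse(counts > 0, 1, NA)

# 15 columns: CaseID plus the 14 violation types
wide <- data.frame(CaseID = rownames(flags), flags,
                   row.names = NULL, stringsAsFactors = FALSE)
```

Because the factor carries all 14 levels, every case contributes a count vector of the same length, which is what lets 'rbind' stack them into one matrix.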
Chel Hee Lee
On 12/18/2014 10:02 AM, Jeff Newmiller wrote:
> No guarantees on "best"... but one way using base R could be:
>
> # Note that "CaseID" is actually not a valid PViol.Type as you had it
> PViol.Type <- c( "BW.BackWages"
> , "LD.Liquid_Damages"
> , "MW.Minimum_Wage"
> , "OT.Overtime"
> , "RK.Records_FLSA"
> , "V.Poster_Other"
> , "AS.Age"
> , "BW.WHMIS_BackWages"
> , "HS.Hours"
> , "OA.HazOccupationAg"
> , "ON.HazOccupationNonAg"
> , "R3.Reg3AgeOccupation"
> , "RK.Records_CL"
> , "V.Other" )
>
> # explicitly specifying all levels to the factor ensures a complete
> # set of column outputs regardless of what is in the input
> PViol.Type.Per.Case.Original <-
> data.frame( CaseID
> , Primary.Viol.Type=factor( Primary.Viol.Type
> , levels=PViol.Type ) )
>
> tmp <- table( PViol.Type.Per.Case.Original )
> ans <- data.frame( CaseID=rownames( tmp )
> , as.data.frame( ifelse( 0==tmp, NA, 1 ) )
> )
>
>
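
A hedged variant of the 'table()' approach quoted above, written so it runs on its own. It is a sketch, not Jeff's exact code: it drops the "table" class with 'unclass()' and recodes the counts by indexing instead of 'ifelse()'. The names 'viol', 'tab' and 'm' are illustrative only.

```r
# Data and levels as given in the thread
CaseID <- c("1015285", "1005317", "1012281", "1015285", "1015285",
            "1007183", "1008833", "1015315", "1015322", "1015285")
Primary.Viol.Type <- c("AS.Age", "HS.Hours", "HS.Hours", "HS.Hours",
                       "RK.Records_CL", "OT.Overtime", "OT.Overtime",
                       "OT.Overtime", "V.Poster_Other", "V.Poster_Other")
PViol.Type <- c("BW.BackWages", "LD.Liquid_Damages", "MW.Minimum_Wage",
                "OT.Overtime", "RK.Records_FLSA", "V.Poster_Other",
                "AS.Age", "BW.WHMIS_BackWages", "HS.Hours",
                "OA.HazOccupationAg", "ON.HazOccupationNonAg",
                "R3.Reg3AgeOccupation", "RK.Records_CL", "V.Other")

viol <- factor(Primary.Viol.Type, levels = PViol.Type)

# Two-way contingency table: one row per case, one column per level,
# including levels that never occur in the data
tab <- table(CaseID, viol)

# Treat the counts as a plain matrix, then recode 0 -> NA and >0 -> 1
m <- unclass(tab)
m[m == 0] <- NA
m[!is.na(m)] <- 1

ans <- data.frame(CaseID = rownames(m), as.data.frame.matrix(m),
                  row.names = NULL, stringsAsFactors = FALSE)
```

Unused levels such as "BW.BackWages" still get a column (all NA), which is the point of specifying the levels explicitly.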
> On Wed, 17 Dec 2014, bcrombie wrote:
>
>> # I have a dataframe that contains 2 columns:
>> CaseID <- c('1015285',
>> '1005317',
>> '1012281',
>> '1015285',
>> '1015285',
>> '1007183',
>> '1008833',
>> '1015315',
>> '1015322',
>> '1015285')
>>
>> Primary.Viol.Type <- c('AS.Age',
>> 'HS.Hours',
>> 'HS.Hours',
>> 'HS.Hours',
>> 'RK.Records_CL',
>> 'OT.Overtime',
>> 'OT.Overtime',
>> 'OT.Overtime',
>> 'V.Poster_Other',
>> 'V.Poster_Other')
>>
>> PViol.Type.Per.Case.Original <- data.frame(CaseID,Primary.Viol.Type)
>>
>> # CaseIDs can be repeated because there can be up to 14
>> # Primary.Viol.Types per CaseID.
>>
>> # I want to transform this dataframe into one that has 15 columns, where
>> # the first column is CaseID, and the rest are the 14 primary viol. types.
>> # The CaseID column will contain a list of the unique CaseIDs (no
>> # replicates) and for each of their rows, there will be a '1' under a
>> # column corresponding to a primary violation type recorded for that
>> # CaseID. So, technically, there could be zero to 14 '1's in a CaseID's
>> # row.
>>
>> # For example, the row for CaseID '1015285' above would have a '1' under
>> # 'AS.Age', 'HS.Hours', 'RK.Records_CL', and 'V.Poster_Other', but have
>> # "NA" under the rest of the columns.
>>
>> PViol.Type <- c("CaseID",
>> "BW.BackWages",
>> "LD.Liquid_Damages",
>> "MW.Minimum_Wage",
>> "OT.Overtime",
>> "RK.Records_FLSA",
>> "V.Poster_Other",
>> "AS.Age",
>> "BW.WHMIS_BackWages",
>> "HS.Hours",
>> "OA.HazOccupationAg",
>> "ON.HazOccupationNonAg",
>> "R3.Reg3AgeOccupation",
>> "RK.Records_CL",
>> "V.Other")
>>
>> PViol.Type.Columns <- t(data.frame(PViol.Type))
>>
>> # What is the best way to do this in R?
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Make-2nd-col-of-2-col-df-into-header-row-of-same-df-then-adjust-col1-data-display-tp4700878.html
>>
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ---------------------------------------------------------------------------
> Jeff Newmiller The ..... ..... Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
> Live: OO#.. Dead: OO#.. Playing
> Research Engineer (Solar/Batteries O.O#. #.O#. with
> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
>