[R] Reshaping matrix of lists as dataframe
Henrique Dallazuanna
wwwhsd at gmail.com
Mon Feb 1 12:13:26 CET 2010
You can try this also:
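# lapply (not sapply) always returns a list, even when all cases have equal length
m <- do.call(rbind, lapply(split(x, rep(seq(length(x)/3), each = 3)),
                           do.call, what = cbind))
# one length per case, taken from the first vector of each group of three
n <- sapply(x, length)[seq(1, length(x), by = 3)]
dimnames(m) <- list(paste("Case", rep(seq_along(n), n), sep = ""),
                    c("First", "Length", "Value"))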
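The matrix built this way still lacks the Case and Sequence columns of the desired output. A minimal sketch of one way to add them, assuming the same x as in the quoted summary; the names groups, blocks, n and flat are only illustrative:

```r
# Same example data as in the quoted summary
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))

# One group of three vectors (First, Length, Value) per case
groups <- split(x, rep(seq_len(length(x) / 3), each = 3))

# Bind the three vectors of each case side by side
blocks <- lapply(groups, function(g) do.call(cbind, g))

# Number of rows contributed by each case
n <- unname(sapply(blocks, nrow))

# Stack the per-case blocks and label the columns
m <- do.call(rbind, blocks)
colnames(m) <- c("First", "Length", "Value")

# Add the case label and the within-case counter
flat <- data.frame(Case = rep(paste("Case", seq_along(n), sep = ""), n),
                   Sequence = sequence(n),
                   m,
                   row.names = NULL)
```

Taking the row counts from the blocks themselves, rather than from unique(), keeps the labels right even if two cases happen to have the same length.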
On Mon, Feb 1, 2010 at 5:58 AM, Oliver Gondring <olihui at gmx.de> wrote:
> Hello William, hello David,
>
> thanks a lot for helping and keeping me going on what sometimes seems to be
> a long way to R mastery! :)
>
> I found that the two solutions William proposed were in fact easier for me
> to understand at the moment than David's (and have the additional advantage
> of producing the desired data types ('numeric'/'integer') in columns 2-5);
> however, I think all of the code you provided will be extremely helpful
> for learning some new tricks by analyzing it in detail.
>
> For everyone concerned with similar data manipulation tasks, here's a short
> summary of the thread:
>
>>>> The original data (a matrix of _lists_, of course - mea culpa - hence the
>>>> modified name of the thread):
>
> x <- list(c(1,2,4),c(1,3,5),c(0,1,0),
> c(1,3,6,5),c(3,4,4,4),c(0,1,0,1),
> c(3,7),c(1,2),c(0,1))
> data <- matrix(x,byrow=TRUE,nrow=3)
> colnames(data) <- c("First", "Length", "Value")
> rownames(data) <- c("Case1", "Case2", "Case3")
>
>> data
>       First     Length    Value
> Case1 Numeric,3 Numeric,3 Numeric,3
> Case2 Numeric,4 Numeric,4 Numeric,4
> Case3 Numeric,2 Numeric,2 Numeric,2
>
>
>>>> The desired output (a dataframe of a database-like 'flat' structure):
>
>>    Case Sequence First Length Value
>> 1 Case1        1     1      1     0
>> 2 Case1        2     2      3     1
>> 3 Case1        3     4      5     0
>> 4 Case2        1     1      3     0
>> 5 Case2        2     3      4     1
>> 6 Case2        3     6      4     0
>> 7 Case2        4     5      4     1
>> 8 Case3        1     3      1     0
>> 9 Case3        2     7      2     1
>
>
>>>> Ways to do it:
>
> (1)
>> lengths<-sapply(data[,1],length)
>> data.frame(Case=rep(rownames(data),lengths),
> Sequence=sequence(lengths), apply(data,2,unlist),
> row.names=NULL)
>
>> It assumes that sapply(data[,k],length) is the
>> same for all k in 1:ncol(data).
>
> Which, as you correctly inferred from the example dataset (I forgot to
> mention it explicitly), is always the case.
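The equal-length assumption can be checked before reshaping. A rough sketch, rebuilding the example matrix from the summary; len and same_lengths are names made up here:

```r
# Example data from the thread, arranged as a 3 x 3 matrix of lists
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))
data <- matrix(x, byrow = TRUE, nrow = 3,
               dimnames = list(c("Case1", "Case2", "Case3"),
                               c("First", "Length", "Value")))

# Length of every cell, kept in the same 3 x 3 layout as `data`
len <- matrix(sapply(data, length), nrow = nrow(data),
              dimnames = dimnames(data))

# TRUE when, within each row, every column has the same length as column 1
same_lengths <- all(len == len[, 1])
```

If same_lengths is FALSE, apply(data, 2, unlist) would no longer give columns of equal length, so solution (1) should not be used as is.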
>
> (2)
>> data.frame(Case=rep(rownames(data),lengths),
> Sequence=sequence(lengths),
> lapply(split(data,colnames(data)[col(data)]), unlist),
> row.names=NULL)
>
> (3)
> (David's code with some additions to produce nearly the same output as (1)
> and (2))
> (however there's still one difference: columns 2-5 are 'factors')
>> result <- data.frame(do.call(rbind,
> sapply(rownames(data), function(.x) cbind(.x,
> # those were the rownames
> cbind(1:length(data[.x, "First"][[1]]),
> # and that was the incremental counter
> sapply(data[.x, ],
> # and finally the values, which unfortunately get turned into characters
> function(.y) return(.y ) ) ) ) )))
>> colnames(result)[1:2] <- c("Case","Sequence")
>> result
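The factor columns mentioned for (3) can be converted back to numbers afterwards. A small sketch of that step, using a hypothetical stand-in for result (three made-up rows, all columns stored as factors) rather than the real output of the code above:

```r
# Hypothetical stand-in for `result`: every column is a factor
result <- data.frame(Case = c("Case1", "Case1", "Case3"),
                     Sequence = c("1", "2", "1"),
                     First = c("4", "10", "3"),
                     stringsAsFactors = TRUE)

# Every column except the case label should become numeric
num_cols <- setdiff(names(result), "Case")

# Go through as.character() first: as.numeric() applied directly to a
# factor returns the internal level codes, not the printed values
result[num_cols] <- lapply(result[num_cols],
                           function(col) as.numeric(as.character(col)))
```

The same conversion also works if the columns are plain character vectors, since as.character() leaves those unchanged.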
>
> Cheers,
> Oliver
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O