[R] Maintaining repeated ID numbers when transposing with reshape

Adaikalavan Ramasamy a.ramasamy at imperial.ac.uk
Thu Aug 28 19:13:14 CEST 2008


Not the prettiest code, but it returns what you want. It might be slow for 
large data frames, since it loops over every ID and then over every row.

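# TEST must be a factor (the loop uses its levels); R >= 4.0.0 no longer
# converts strings to factors by default, so ask for it explicitly.
df <- data.frame( ID=c(1,1,1,1,2,2),
                  TEST=c("A","A","B","C","B","B"),
                  RESULT=c(17,12,15,12,8,9),
                  stringsAsFactors=TRUE )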


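big.out <- list()

for( uID in unique(df$ID) ){
  # rows for this ID; assumes rows are sorted by TEST within each ID
  m <- df[ df$ID == uID, , drop=FALSE ]
  # number the repeats 1, 2, ... within each TEST level
  run.order <- unlist( lapply( table(m$TEST), seq_len ) )
  m <- cbind( m, run.order=run.order )

  # one output row per repeat, one column per TEST level, filled with NA
  nr <- max(run.order)
  out <- matrix( NA, nrow=nr, ncol=nlevels(m$TEST),
                 dimnames=list( rep(uID, nr), levels(m$TEST) ))

  for(i in seq_len(nrow(m))) out[ m$run.order[i], m$TEST[i] ] <- m$RESULT[i]
  # store by name so IDs need not be small positive integers
  big.out[[ as.character(uID) ]] <- out
}

do.call( "rbind", big.out )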

   A  B  C
1 17 15 12
1 12 NA NA
2 NA  8 NA
2 NA  9 NA
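
A loop-free variant, as a rough sketch only: number the repeats with ave()
and let reshape() treat (ID, RUN) as the id. RUN is a helper column made up
for this example, and the sketch assumes a repeat is simply the next row with
the same ID and TEST. The wide columns come out named RESULT.A, RESULT.B and
RESULT.C.

```r
# Same example data as above.
df <- data.frame(ID     = c(1, 1, 1, 1, 2, 2),
                 TEST   = c("A", "A", "B", "C", "B", "B"),
                 RESULT = c(17, 12, 15, 12, 8, 9))

# RUN = 1, 2, ... counts repeats of the same TEST within the same ID
# (RUN is a helper column introduced here, not part of the original data).
df$RUN <- ave(seq_len(nrow(df)), df$ID, df$TEST, FUN = seq_along)

# With (ID, RUN) as the id, every row of df is a distinct (id, time) pair,
# so reshape() keeps all repeats instead of only the first one.
wide <- reshape(df, v.names = "RESULT", idvar = c("ID", "RUN"),
                timevar = "TEST", direction = "wide")
wide <- wide[order(wide$ID, wide$RUN), ]
wide
```

Drop the RUN column afterwards if only the repeated ID is wanted.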


Regards, Adai


jcarmichael wrote:
> Thank you for your suggestion, I will play around with it. I guess my concern
> is that I need each test result to occupy its own "cell" rather than have
> one or more in the same row.
> 
> 
> Adaikalavan Ramasamy-2 wrote:
>> There might be a more elegant way of doing this but here is a way of 
>> doing it without reshape().
>>
>>     # factor TEST so tapply() returns every level for every ID
>>     df <- data.frame( ID=c(1,1,1,1,2,2),
>>                       TEST=c("A","A","B","C","B","B"),
>>                       RESULT=c(17,12,15,12,8,9),
>>                       stringsAsFactors=TRUE )
>>
>>     df.s <- split( df, df$ID )
>>
>>     out  <- sapply( df.s, function(m)
>>                     tapply( m$RESULT, m$TEST, paste, collapse="," ) )
>>
>>     t(out)
>>
>>       A       B     C
>>     1 "17,12" "15"  "12"
>>     2 NA      "8,9" NA
>>
>> Not the same output as you wanted, but this makes more sense unless you 
>> have a reason to prioritize 17 over 12 in the first row.
>>
>> Regards, Adai
>>
>>
>> jcarmichael wrote:
>>> I have a dataset in "long" format that looks something like this:
>>>
>>> ID   TEST    RESULT
>>> 1       A          17
>>> 1       A          12
>>> 1       B          15
>>> 1       C          12
>>> 2       B           8
>>> 2       B           9
>>>
>>> Now what I would like to do is transpose it like so:
>>>
>>> ID    TEST A    TEST B    TEST C
>>> 1         17           15          12
>>> 1         12            .            .
>>> 2          .             8            .
>>> 2          .             9            .
>>>
>>> When I try:
>>>
>>> reshape(mydata, v.names="result", idvar="id",timevar="test",
>>> direction="wide")
>>>
>>> It gives me only the first occurrence of each test for each subject.  How
>>> can I transpose my dataset in this way without losing information about
>>> repeated tests?
>>>
>>> Any help or guidance would be appreciated!  Thanks!