[R] more complex by with data.table???

Ista Zahn istazahn at gmail.com
Wed Jun 10 02:17:19 CEST 2015


Hi Ramiro,

There is a demonstration of this on the data.table wiki at
https://rawgit.com/wiki/Rdatatable/data.table/vignettes/datatable-intro-vignette.html.
You can do

dt[, lapply(.SD, mean), by=name]

or

dt[, as.list(colMeans(.SD)), by=name]
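
If you want to apply your own function, or only summarize some of the columns, a sketch along these lines should work. It reuses myFunction and columnNames from your example and assumes the .SDcols argument is the right way to restrict .SD to those columns:

```r
library(data.table)

dt <- data.table(name = c(rep("a", 5), rep("b", 6)),
                 var1 = 0:10, var2 = 20:30, var3 = 40:50)
myFunction <- function(x) { mean(x) }
columnNames <- c("var1", "var2", "var3")

# .SDcols restricts .SD to the listed columns; lapply() then applies
# myFunction to each of those columns within every group defined by 'by'
result <- dt[, lapply(.SD, myFunction), by = name, .SDcols = columnNames]
result
```

This gives one row per value of name and one column per entry in columnNames.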

BTW, there are pretty straightforward ways to do this in base R as well, e.g.,

data.frame(t(sapply(split(df[-1], df$name), colMeans)))
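
Another base R option, again only as a sketch, is aggregate() with the formula interface. It uses the same df as in your example and keeps the group label as a regular column rather than as row names:

```r
df <- data.frame(name = c(rep("a", 5), rep("b", 6)),
                 var1 = 0:10, var2 = 20:30, var3 = 40:50)

# '.' on the left-hand side stands for every column other than the
# grouping variable; FUN is applied to each column within each group
res <- aggregate(. ~ name, data = df, FUN = mean)
res
```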

Best,
Ista

On Tue, Jun 9, 2015 at 4:22 PM, Ramiro Barrantes
<ramiro at precisionbioassay.com> wrote:
> Hello,
>
> I am trying to do something that I am able to do with the "by" function within data.frame but can't figure out how to achieve with data.table.
>
> Consider
>
> dt<-data.table(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
> myFunction <- function(x) { mean(x) }
>
> I am aware that I can do something like:
>
> dt[, .(meanVar1=myFunction(var1)) ,by=.(name)]
>
> but how could I do the equivalent of:
>
> df<-data.frame(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
> myFunction <- function(x) { mean(x) }
>
> columnNames <- c("var1","var2","var3")
> result <- by(df, df$name, function(x) {
>    output <- c()
>    for(col in columnNames) {
>      output[col] <- myFunction(x[,col])
>    }
>   output
> })
> do.call(rbind,result)
>
> Thanks in advance,
> Ramiro


