[R] transformation of data.frame
Petr PIKAL
petr.pikal at precheza.cz
Mon Jul 12 14:09:00 CEST 2010
Hi
Assa Yeroslaviz <frymor at gmail.com> wrote on 09.07.2010 13:25:43:
> Hello Petr,
>
> sorry for the mix-up. Your example works perfectly fine.
>
> The one from Søren produced the error I mentioned. But even after reading
> the columns as character
>
> > go <- read.table("go.txt", header = TRUE, colClasses = c("character", "character"))
> or
> > go <- read.table("go.txt", header= TRUE, as.is = 1)
>
>
> it didn't solve the problem.
> the command:
> gmt <- lapplyBy(~GO, data = go, FUN = function(uu) {as.list(uu$GO[1],
>   paste(uu$gen, collapse = " "))})
>
> tries to convert my first column into integers and then adds 'NA's.
>
> What I don't understand is why.
> Can lapplyBy work only with integers?
I do not use the doBy library, so I cannot give you a definite explanation.
I do not believe that lapplyBy works with integers only. Its description
says that it is a formula version of lapply.
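For illustration, the idea of a grouped lapply can be sketched in base R with split() followed by lapply(). This is only an analogue under my own assumptions, not necessarily how doBy implements lapplyBy, and the data frame go below is a made-up stand-in for the real file:

```r
# Hypothetical stand-in for the real data; both columns are character.
go <- data.frame(
  GO  = c("GO:0042787", "GO:0016070", "GO:0016070", "GO:0007409"),
  gen = c("gen2", "gen2", "gen3", "Gen1"),
  stringsAsFactors = FALSE
)

# Split the data frame into one piece per GO term, then apply a
# function to each piece (base-R analogue of a grouped lapply).
pieces <- split(go, go$GO)
gmt <- lapply(pieces, function(uu) {
  list(uu$GO[1], paste(uu$gen, collapse = " "))
})

# The genes collected for one GO term
gmt[["GO:0016070"]][[2]]
```

Nothing in this sketch requires the grouping column to be numeric; character IDs work as grouping keys.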
This is what you did:
gmt <- lapplyBy(~GO, data = go, FUN = function(uu) {as.list(uu$GO[1],
  paste(uu$gen, collapse = " "))})
and this is what Søren advised:
aa <- lapplyBy(~ID, data = ddd, FUN = function(uu) {list(uu$ID[1],
  paste(uu$gen, collapse = ":"))})
so maybe
gmt <- lapplyBy(~GO, data = go, FUN = function(uu) {list(uu$GO[1],
  paste(uu$gen, collapse = " "))})
gives you the desired result.
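The only difference between your call and Søren's is as.list() versus list(). A small base-R check shows why that matters; go_id and genes are made-up values for illustration. Whether this also accounts for the coercion warnings is something I cannot confirm without doBy:

```r
# Made-up values standing in for one group of the data
go_id <- "GO:0016070"
genes <- c("gen2", "gen3", "gen4")

# as.list() converts only its first argument; anything after it is
# passed through `...` and does not end up in the result.
a <- as.list(go_id, paste(genes, collapse = " "))

# list() builds a list from all of its arguments.
b <- list(go_id, paste(genes, collapse = " "))

length(a)  # 1: the pasted gene string is missing
length(b)  # 2: ID and gene string
```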
Regards
Petr
>
> THX,
>
> Assa
>
> 2010/7/8 Petr PIKAL <petr.pikal at precheza.cz>
> Hi
>
> r-help-bounces at r-project.org wrote on 08.07.2010 12:02:45:
>
> > I don't understand it. When I run this example it works fine, but when
> > I add the "GO:" to the beginning of the first column (as seen in my
> > wanted result table):
> > GO0042787
> > GO0016070
> > GO0016070
> >
> > I'm getting a list of warnings:
> > Warning messages:
> > 1: In storage.mode(xi) <- a$sm : NAs introduced by coercion
> > 2: In storage.mode(xi) <- a$sm : NAs introduced by coercion
> > ...
> > 9: In storage.mode(xi) <- a$sm : NAs introduced by coercion
> > 10: In storage.mode(xi) <- a$sm : NAs introduced by coercion
> Not sure what is wrong; it seems to me that your ID became a factor.
>
> Having your data in a data frame test with character columns
>
> see ?str
>
> test.ag<-aggregate(test$X.gen, list(test$ID), function(x) paste(x,
> collapse=":"))
>
> I can make an aggregated data frame
>
> paste("GO",test.ag[,1], sep="")
> [1] "GO0006417" "GO0006511" "GO0007409" "GO0016070" "GO0042787"
>
> and it is straightforward to add GO at the beginning.
>
> I leave how to add this result to your aggregated data frame as an
> exercise.
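One possible way to put these pieces together in base R is sketched below. It assumes the example table from the original question, read with character columns so that the leading zeros of the IDs survive; the column name gen (instead of X.gen) and the "GO:" prefix with a colon are my assumptions about the desired layout:

```r
# Example table from the thread, supplied inline for reproducibility
txt <- "ID gen
0042787 gen2
0016070 gen2
0016070 gen3
0007409 Gen1
0007409 gen3
0006511 gen2
0006417 gen3
0016070 gen4
0006511 gen4"

# colClasses = "character" keeps "0042787" from being read as 42787
test <- read.table(text = txt, header = TRUE, colClasses = "character")

# Collapse all genes belonging to the same ID into one string
test.ag <- aggregate(test$gen, list(ID = test$ID),
                     function(x) paste(x, collapse = ":"))
names(test.ag)[2] <- "gen"

# Overwrite the ID column with the prefixed version
test.ag$ID <- paste("GO:", test.ag$ID, sep = "")
test.ag
```

Assigning the pasted vector back to test.ag$ID keeps the IDs and the collapsed gene strings in the same rows.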
>
> Regards
> Petr
>
>
> >
> > What did I do wrong here?
> >
> > Assa
> >
> > On Thu, Jul 8, 2010 at 11:09, Søren Højsgaard <Soren.Hojsgaard at agrsci.dk> wrote:
> >
> > > Like this?
> > >
> > > > library(doBy)
> > > > (ddd <- read.table("foo.txt",header=T))
> > > ID gen
> > > 1 42787 gen2
> > > 2 16070 gen2
> > > 3 16070 gen3
> > > 4 7409 Gen1
> > > 5 7409 gen3
> > > 6 6511 gen2
> > > 7 6417 gen3
> > > 8 16070 gen4
> > > 9 6511 gen4
> > > > aa<-lapplyBy(~ID, data=ddd,
> > > + FUN=function(uu){
> > > + list(uu$ID[1], paste(uu$gen, collapse=":"))
> > > + })
> > > >
> > > > do.call(rbind,aa)
> > > [,1] [,2]
> > > 42787 42787 "gen2"
> > > 16070 16070 "gen2:gen3:gen4"
> > > 7409 7409 "Gen1:gen3"
> > > 6511 6511 "gen2:gen4"
> > > 6417 6417 "gen3"
> > >
> > > Regards
> > > Søren
> > >
> > >
> > >
> > >
> > >
> > > -----Original message-----
> > > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> > > On behalf of Assa Yeroslaviz
> > > Sent: 8 July 2010 10:45
> > > To: r-help at stat.math.ethz.ch
> > > Subject: [R] transformation of data.frame
> > >
> > > Hello all R users,
> > >
> > > I have a problem transforming (or maybe better, regrouping) a data.frame.
> > > I have a big data.frame, which I would like to sum up according to a
> > > specific column.
> > >
> > > This is an example of my matrix:
> > > ID gen
> > > 0042787 gen2
> > > 0016070 gen2
> > > 0016070 gen3
> > > 0007409 Gen1
> > > 0007409 gen3
> > > 0006511 gen2
> > > 0006417 gen3
> > > 0016070 gen4
> > > 0006511 gen4
> > >
> > > I want to rearrange the matrix according to column GO, so that it will
> > > look like this:
> > >
> > > GO:0042787 gen2
> > > GO:0016070 gen2 : gen3 : gen4
> > > GO:0007409 gen1 : gen3
> > > GO:0006511 gen2 : gen4
> > > GO:0006417 gen3
> > >
> > > I've tried it with the package doBy (lapplyBy and paste) but it just
> > > doesn't work out.
> > >
> > > I will be very happy for any suggestions you might have to help me.
> > >
> > > Thanks
> > >
> > > Assa
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >