[BioC] plotting a CA

aedin culhane aedin at jimmy.harvard.edu
Fri Mar 9 23:20:30 CET 2012


Hi Aoife
xax and yax select which axes (components) of the analysis are drawn as
the x and y axes of the plot, so xax = 1 and yax = 2 (the defaults) plot
the first 2 components.
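
To make that concrete, here is a minimal sketch in base R. It is not the plotCA code itself: the function pickAxes and the data frame coords are made up for illustration. It shows how arguments with default values such as xax = 1 and yax = 2 behave, and why asking for an axis that is not in the coordinate table fails. My guess (an assumption from the error text, as I have not seen your session) is that codonCA held the result of ca() or ord() rather than dudi.coa() when you called plotCA, in which case codonCA$li is NULL and there is no first column to plot.

```r
## Hypothetical helper (not part of ade4 or made4): select two axes
## from a table of coordinates, defaulting to the first two
pickAxes <- function(coords, xax = 1, yax = 2) {
  coords <- data.frame(coords)
  ## Refuse axes that are not columns of coords
  if (xax < 1 || xax > ncol(coords)) stop("xax is not a column of coords")
  if (yax < 1 || yax > ncol(coords)) stop("yax is not a column of coords")
  coords[, c(xax, yax), drop = FALSE]
}

## Made-up coordinates for 3 points on 3 axes
coords <- data.frame(Axis1 = c(0.1, -0.2, 0.3),
                     Axis2 = c(0.4, 0.5, -0.6),
                     Axis3 = c(-0.7, 0.8, 0.9))

firstTwo    <- pickAxes(coords)                    # defaults: axes 1 and 2
oneAndThree <- pickAxes(coords, xax = 1, yax = 3)  # override yax only

## A list without an $li element returns NULL, hence zero columns,
## so even xax = 1 is out of range
notADudi    <- list(rowcoord = coords)
emptyCoords <- data.frame(notADudi$li)
failed <- inherits(try(pickAxes(emptyCoords), silent = TRUE), "try-error")
```

In your own session, class(codonCA) and names(codonCA) will tell you what kind of object you have. Re-running codonCA<-dudi.coa(codonData, scan=FALSE) just before calling plotCA should give it the $li and $co tables it expects.
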
Aedin

On 03/09/2012 05:09 PM, aoife doherty wrote:
> Oh wow, I was way off. Many thanks.
> May I ask one question (I'm a total newbie)? I was trying out the
> different pieces of (much appreciated) code because I want to play
> around with them and make sure I understand them.
>
> But I have never used a function in R.
>
> For this section:
>
> xax =1,  yax = 2,  .. of this line
> plotCA<-function(dudi, rowFac, cols, plotgroups=FALSE,
> plotrowLabels=FALSE, pch=c(1:levels(rowFac))+10, xax =1,  yax = 2,  ...) {
>
> may I just ask what they represent?
>
> I am trying to work out how everything works by copying and pasting each
> line into R and then seeing what happens, but for that line I keep getting:
>
>  > plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"))
> Error in scatterutil.base(dfxy = dfxy, xax = xax, yax = yax, xlim =
> xlim,  :
>    Non convenient selection for xax
>  > plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"),
> plotrowLabels=TRUE)
> Error in scatterutil.base(dfxy = dfxy, xax = xax, yax = yax, xlim =
> xlim,  :
>    Non convenient selection for xax
>  > plotCA(codonCA, rowFac=fac, plotgroups=TRUE, cols=c("red", "blue"))
> Error in `[.data.frame`(dfxy, , xax) : undefined columns selected
>
> Deeply indebted.
> Aoife
>
> On Fri, Mar 9, 2012 at 5:49 PM, aedin culhane
> <aedin at jimmy.harvard.edu> wrote:
>
>     Hi Tim, Aoife and Susan
>
>     Sorry Tim, I didn't know that I said not to use made4. When did I
>     say this? I may have said I need to update some of the functions as
>     I wrote the made4 package many years ago.
>
>     Susan, made4 calls ade4 but is designed to convert microarray and
>     other Bioconductor data classes into formats that can be input into
>     ade4. It calls ade4 (and other) plot functions but with more
>     sensible defaults for genomics data (i.e. it doesn't label all of the
>     objects!).  When I implemented the package I did it with Guy and
>     Jean who wrote the paper you cited and I wholeheartedly agree with
>     all you say ;-)
>
>
>     However, Aoife, your code plot(ca(table,suprow=c(4,5))) can't be used
>     for what you want.  It will plot rows 4 and 5 as supplementary
>     points on the plot. These points won't be used in the computation
>     of the analysis, and thus it would not provide what you want.  Have a
>     look at these plots
>
>     ### ----------------------------------------------
>     ##  From here, you can copy/paste everything to R
>     ##------------------------------------------------
>
>
>     ## Your data... I renamed it, as table is a function in R
>
>     codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11,  8, 8,
>     10, 7),  ncol=3, dimnames = list(c("gene1","gene2", "gene3",
>     "gene4", "gene5"), c("codon1", "codon2","codon3")))
>
>     library(ca)
>     codonCA<-ca(codonData)
>
>     ## Draw 2 plots, one with results of analysis of all the data,
>     # the other as you described
>
>     par(mfrow=c(1,2))
>     plot(ca(codonData,suprow=c(4,5)))
>     plot(codonCA)
>
>     ## You will notice that the 2 plots are very different,
>     ## one analysis is a CA of all 5 rows, the other is only 3 rows.
>
>
>     ## To run a CA on a dataset using made4 or ade4, use the following code
>
>     ## install made4
>     ## source("http://bioconductor.org/biocLite.R")
>     ## biocLite("made4")
>
>     library(made4)
>
>     ## example dataset
>     data(khan)
>     df<-khan$train
>
>     ## The function ord will run PCA, CA or NSC,
>     ## by default it runs CA (by calling dudi.coa from ade4)
>
>     myCA<- ord(df)
>     plot(myCA)
>     plotgenes(myCA)
>     plotarrays(myCA)
>
>
>     ## using the ade4 library
>     library(ade4)
>     codonCA<-dudi.coa(codonData, scan=FALSE)
>     scatter(codonCA)
>
>
>     ## However, neither of these will do exactly as you wish
>     ## made4 expects groups in the columns, not the rows (genes x samples)
>
>     library(made4)
>     codonCA<-ord(t(codonData))
>
>     ## Create a factor which lists the groups of "nodes" of interest
>     fac<-factor(c(rep("Node1",3), rep("Node2", 2)))
>     fac
>     plot(codonCA, classvec=fac)
>
>
>
>     ## but the function below will do what you need.
>
>
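>     ## The default pch uses seq_along(): 1:levels(rowFac) fails because
>     ## levels() returns the character labels, not a count
>     plotCA<-function(dudi, rowFac, cols, plotgroups=FALSE,
>     plotrowLabels=FALSE, pch=seq_along(levels(rowFac))+10, xax =1,  yax = 2,
>       ...) {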
>
>       require(made4)
>
>       fac2char<-function(fac, newLabels) {
>            cLab<- class(newLabels)
>            if (length(levels(fac)) != length(newLabels))
>              stop("Number of labels does not equal the number of factor levels")
>            vec<-as.character(factor(fac, labels=newLabels))
>            if(inherits(newLabels, "numeric")) vec<-as.numeric(vec)
>            return(vec)
>            }
>
>
>       ## Use the rowFac argument, not a variable fac from the workspace
>       if (plotgroups)  s.groups(dudi$li, rowFac,  col=cols)
>       if (!plotgroups) {
>         pchs<-fac2char(rowFac, pch)
>         cols<-fac2char(rowFac, cols)
>
>
>         if (!plotrowLabels) s.var(dudi$li, boxes=FALSE, pch=pchs,
>     col=cols, cpoint=2, clabel=0, xax=xax, yax=yax,  ...)
>         if (plotrowLabels)  s.var(dudi$li, boxes=FALSE, col=cols,
>       xax=xax, yax=yax,  ...)
>       }
>
>       s.var(dudi$co, boxes=FALSE, pch=19, col="black", add.plot = TRUE,
>     xax=xax, yax=yax,  ...)
>     }
>
>     ##--------------------------------------------
>     ## Examples: Function has 3 different options
>     ##--------------------------------------------
>
>     library(ade4)
>     codonCA<-dudi.coa(codonData, scan=FALSE)
>
>     ## Option 1, plot a biplot (cases and samples) with points
>     ## colored by rowFac
>
>     plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"))
>
>     ## Option 2. Same plot as above, but with labels rather than points
>
>     plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"),
>     plotrowLabels=TRUE)
>
>     ## Option 3, Same plot but put a circle around the groups
>     ## If you look at the help page for s.groups (in made4)
>     ## which calls s.class (in ade4) you will see you can also
>     ## change the size and other details about the
>     ## ellipse (or circle drawn around the groups)
>
>     plotCA(codonCA, rowFac=fac, plotgroups=TRUE, cols=c("red", "blue"))
>
>
>
>
>
>     On Thu, Mar 8, 2012 at 9:20 AM, aoife doherty
>     <aoife.m.doherty at gmail.com> wrote:
>
>      > Many thanks. I tried this:
>      >
>      > table <- structure(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11,
>      >    8, 8, 10, 7), .Dim = c(5L, 3L), .Dimnames = list(c("gene1",
>      > "gene2", "gene3", "gene4", "gene5"), c("codon1", "codon2",
>      > "codon3")))
>      >
>      > library(ca)
>      >
>      > plot(ca(table,suprow=c(4,5)))
>      >
>      > This gives me a CA plot in which the nodes of interest (4 and 5) are
>      > open circles.
>      >
>      > However, I have two questions.
>      >
>      > 1. Is it possible, instead of manually typing in 4 and 5, to somehow
>      > get R to read in a list of nodes of interest? Basically, is it
>      > possible to change:
>      >
>      > c(4,5) to c(all the nodes that are in a file)
>      >
>      > and
>      >
>      > 2. Is it possible, instead of the individual nodes of interest being
>      > open circles, for the area encompassing all the nodes of interest to
>      > be shaded differently/highlighted?
>      > I THINK this is where your suggestion of:
>      >
>      > Your best bet is to use the package ade4
>      > using res=dudi.coa(data)
>      > then
>      > s.class(res$li,group)
>      > where group is your grouping variable you want to highlight.
>      >
>      > comes in, but I am completely new to R. I have genuinely tried to
>      > understand the packages from the manual, but I am still confused.
>      >
>      > Aoife
>      >
>      >
>      >
>      >
>      >
>
>     --
>     Aedin Culhane
>     Computational Biology and Functional Genomics Laboratory
>     Harvard School of Public Health,
>     Dana-Farber Cancer Institute
>
>     web: http://www.hsph.harvard.edu/research/aedin-culhane/
>     email: aedin at jimmy.harvard.edu
>     phone: +1 617 632 2468
>     Fax: +1 617 582 7760
>
>
>     Mailing Address:
>     Attn: Aedin Culhane, SM822C
>     450 Brookline Ave.
>     Boston, MA 02215
>
>


-- 
Aedin Culhane
Computational Biology and Functional Genomics Laboratory
Harvard School of Public Health,
Dana-Farber Cancer Institute

web: http://www.hsph.harvard.edu/research/aedin-culhane/
email: aedin at jimmy.harvard.edu
phone: +1 617 632 2468
Fax: +1 617 582 7760


Mailing Address:
Attn: Aedin Culhane, SM822C
450 Brookline Ave.
Boston, MA 02215


