[BioC] GOstats v2.2.3 . bug in inducedTermGraph
Pierre-Yves Boëlle
boelle at u707.jussieu.fr
Thu May 31 18:44:33 CEST 2007
Using R : 2.5.0 / BioC 2.0 / GOStats 2.2.3 / GO : 1.16 on Windows XP.
the following code reproduces the bug. (quite lengthy... I took the
example from the vignette)
----------------------------------
library("ALL")
library("hgu95av2")
library("annotate")
library("genefilter")
library("GOstats")
library("RColorBrewer")
library("xtable")
library("Rgraphviz")
data(ALL, package = "ALL")
Bcell <- grep("^B", as.character(ALL$BT))
bcrAblOrNegIdx <- which(as.character(ALL$mol) %in% c("NEG", "ALL1/AF4"))
bcrAblOrNeg <- ALL[, intersect(Bcell, bcrAblOrNegIdx)]
bcrAblOrNeg$mol.biol = factor(bcrAblOrNeg$mol.biol)
entrezIds <- mget(featureNames(bcrAblOrNeg), envir = hgu95av2ENTREZID)
haveEntrezId <- names(entrezIds)[sapply(entrezIds, function(x) !is.na(x))]
numNoEntrezId <- length(featureNames(bcrAblOrNeg)) - length(haveEntrezId)
bcrAblOrNeg <- bcrAblOrNeg[haveEntrezId, ]
haveGo <- sapply(mget(featureNames(bcrAblOrNeg), hgu95av2GO),
function(x) { if (length(x) == 1 && is.na(x)) FALSE else TRUE})
numNoGO <- sum(!haveGo)
bcrAblOrNeg <- bcrAblOrNeg[haveGo, ]
iqrCutoff <- 0.5
bcrAblOrNegIqr <- apply(exprs(bcrAblOrNeg), 1, IQR)
selected <- bcrAblOrNegIqr > iqrCutoff
chrN <- mget(featureNames(bcrAblOrNeg), envir = hgu95av2CHR)
onY <- sapply(chrN, function(x) any(x == "Y"))
onY[is.na(onY)] <- FALSE
selected <- selected & !onY
nsFiltered <- bcrAblOrNeg[selected, ]
nsFilteredIqr <- bcrAblOrNegIqr[selected]
uniqGenes <- findLargest(featureNames(nsFiltered), nsFilteredIqr,
"hgu95av2")
nsFiltered <- nsFiltered[uniqGenes, ]
numSelected <- length(featureNames(nsFiltered))
BCRcols = ifelse(nsFiltered$mol == "ALL1/AF4", "goldenrod", "skyblue")
cols = brewer.pal(10, "RdBu")
affyUniverse <- featureNames(nsFiltered)
entrezUniverse <- unlist(mget(affyUniverse, hgu95av2ENTREZID))
if (any(duplicated(entrezUniverse))) stop("error in gene universe: can't
have duplicate EntrezIds")
chipAffyUniverse <- featureNames(bcrAblOrNeg)
chipEntrezUniverse <- mget(chipAffyUniverse, hgu95av2ENTREZID)
chipEntrezUniverse <- unique(unlist(chipEntrezUniverse))
ttestCutoff <- 0.05
ttests = rowttests(nsFiltered, "mol.biol")
smPV = ttests$p.value < ttestCutoff
pvalFiltered <- nsFiltered[smPV, ]
selectedEntrezIds <- unlist(mget(featureNames(pvalFiltered),
hgu95av2ENTREZID))
hgCutoff <- 0.001
params <- new("GOHyperGParams", geneIds = selectedEntrezIds,
universeGeneIds = entrezUniverse, annotation = "hgu95av2",
ontology = "BP", pvalueCutoff = hgCutoff, conditional = FALSE,
testDirection = "over")
hgOver <- hyperGTest(params)
inducedTermGraph(hgOver,names(pvalues(hgOver)[95]),children=FALSE)
----------------------------
fails with :
Erreur dans checkValidNodeName(node) : invalid node names: missing value
NA not allowed
looking at the code (in GOHyperGResults-accessors.R) , it appears that
parents are looked for in the "KidsEnv" instead of "ParentsEnv"
Also, the graphs is reversed? (ancestor at the bottom, children on top)
--
Pierre-Yves Boëlle
Universite Pierre et Marie Curie-Paris6, UMR S 707, Paris, F-75012
INSERM, UMR S 707, Paris, F-75012
AP-HP, hopital St Antoine, Paris, F-75012
Tel : +33 (0)1 49 28 32 26
Fax : +33 (0)1 49 28 32 33
More information about the Bioconductor
mailing list