[BioC] AnnotationDbi: NewSchema vignette error
Marc Carlson
mcarlson at fhcrc.org
Wed Jul 4 00:49:28 CEST 2012
Hi Mark,
I was actually thinking about deprecating this vignette. It is meant
to show you how to make bimaps etc. for unsupported organisms, but as
you discovered, this stuff is a bit complicated to implement. These
days, bimaps are also getting to be a bit "legacy". That is, I am not
going to make them go away, but we are no longer interested in adding
them to everything. We have added a newer interface that we think is
a bit simpler to use.
Because of all that, I think that for this situation it is probably
better to read the new intro vignette instead:
http://www.bioconductor.org/packages/2.10/bioc/vignettes/AnnotationDbi/inst/doc/IntroToAnnotationPackages.pdf
That vignette shows the newer way of accessing the annotations, and it
also discusses how to implement the new way of accessing the data. And
for non-model organisms, I would probably just recommend implementing
these four new methods: select, keys, keytypes and cols (instead of a
series of bimaps). It is probably less work and it is also an easier
interface for new users.
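As a rough illustration of what implementing those four methods involves, here is a minimal sketch. It is not taken from either vignette: the class ToyAnnot, its data.frame slot, the GENEID/SYMBOL columns, and the locally defined generics are all hypothetical stand-ins, chosen so that the example runs in plain R. In a real package you would define methods on the generics that AnnotationDbi exports rather than creating your own, and the exact argument names there may differ from what is assumed here.

```r
library(methods)

# Stand-in generics (assumption): defined locally so the sketch runs without
# AnnotationDbi. A real package would set methods on AnnotationDbi's generics.
setGeneric("keytypes", function(x) standardGeneric("keytypes"))
setGeneric("cols", function(x) standardGeneric("cols"))
setGeneric("keys", function(x, keytype, ...) standardGeneric("keys"))
setGeneric("select",
           function(x, keys, cols, keytype, ...) standardGeneric("select"))

# Hypothetical annotation object backed by a plain data.frame.
setClass("ToyAnnot", representation(tbl = "data.frame"))

# Which columns may be used as keys (only one in this toy example).
setMethod("keytypes", "ToyAnnot", function(x) "GENEID")

# Which columns can be retrieved.
setMethod("cols", "ToyAnnot", function(x) colnames(x@tbl))

# All distinct key values of the requested key type.
setMethod("keys", "ToyAnnot", function(x, keytype, ...) {
    if (missing(keytype)) keytype <- "GENEID"
    unique(as.character(x@tbl[[keytype]]))
})

# Rows matching the supplied keys, restricted to the requested columns.
setMethod("select", "ToyAnnot", function(x, keys, cols, keytype, ...) {
    if (missing(keytype)) keytype <- "GENEID"
    hit <- x@tbl[[keytype]] %in% keys
    x@tbl[hit, unique(c(keytype, cols)), drop = FALSE]
})

toy <- new("ToyAnnot", tbl = data.frame(
    GENEID = c("g1", "g2", "g3"),
    SYMBOL = c("abc1", "def2", "ghi3"),
    stringsAsFactors = FALSE))

res <- select(toy, keys = c("g1", "g3"), cols = "SYMBOL", keytype = "GENEID")
print(res)
```

The point of the design is that a user only has to learn four calls: keytypes() and cols() to discover what is available, keys() to list identifiers, and select() to pull out a data.frame. The object behind them can be an SQLite file, a flat file, or anything else.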
What do you think? Would that help you out or do you really need to
implement bimaps? If you really need that, please let me know and let's
discuss it.
Marc
On 07/02/2012 10:53 PM, Mark Cowley wrote:
> Dear list,
> I just ran through the 'Creating an annotation package with a new database schema' vignette within AnnotationDbi, verbatim, and hit an error in 'code chunk 22', where calling:
>> makeAnnDbPkg(seed, dbfile, dest_dir = tempdir())
> Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, :
> negative extents to matrix
>
> The code's pretty complicated, so I was hoping one of the maintainers could take a look.
>
> method:
> copy & paste each code block within: http://www.bioconductor.org/packages/2.10/bioc/vignettes/AnnotationDbi/inst/doc/NewSchema.R
> including the commented out code blocks re downloading targetscan and populating the database:
>
> ...
> ###################################################
> ### code chunk number 22: build new package (eval = FALSE)
> ###################################################
>> seed <- new("AnnDbPkgSeed",
>     Package = package,
>     Version = "5.0-1",
>     PkgTemplate = packagedir,
>     AnnObjPrefix = "targetscan.Hs.eg",
>     Title = "TargetScan miRNA target predictions for human",
>     Author = "Gabor Csardi<Gabor.Csardi at foo.bar>",
>     Maintainer = "Gabor Csardi<Gabor.Csardi at foo.bar>",
>     organism = "Homo sapiens",
>     species = "Human",
>     biocViews = "AnnotationData, FunctionalAnnotation",
>     DBschema = "TARGETSCAN_DB",
>     AnnObjTarget = "TargetScan (Human)",
>     manufacturer = "None",
>     manufacturerUrl = "None"
> )
>
>> unlink(paste(tempdir(), sep=.Platform$file.sep, package), recursive=TRUE)
>> makeAnnDbPkg(seed, dbfile, dest_dir = tempdir())
> # Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, :
> # negative extents to matrix
>> traceback()
> # 4: matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr,
> # dimnames = list(rn, cn))
> # 3: Ops.data.frame(type, "ChipDb")
> # 2: makeAnnDbPkg(seed, dbfile, dest_dir = tempdir())
> # 1: makeAnnDbPkg(seed, dbfile, dest_dir = tempdir())
>> options(error=recover)
>> makeAnnDbPkg(seed, dbfile, dest_dir = tempdir())
> # Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, :
> # negative extents to matrix
> #
> # Enter a frame number, or 0 to exit
> #
> # 1: makeAnnDbPkg(seed, dbfile, dest_dir = tempdir())
> # 2: makeAnnDbPkg(seed, dbfile, dest_dir = tempdir())
> # 3: Ops.data.frame(type, "ChipDb")
> # 4: matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, dimn
> #
> # Selection: 3
> # Called from: Ops.data.frame(type, "ChipDb")
> # Browse[1]> value
> # [[1]]
> # logical(0)
>
>
>
> I did this twice, from scratch, with the same result, in both R 2.15.0 and R 2.15.1.
>
>
>
> Could you please also make these 2 changes to the vignette:
> 1) code chunk 18, final block has a tab character hiding in this line:
> "INSERT INTO targets\tVALUES(:miR_Family,"
>
> 2) code chunk 15: zip.file.extract is defunct as of R 2.15.1 (but was fine in R 2.15.0). I've updated this code block:
>
> targetfile <- file.path(tempdir(), "Predicted_Targets_Info.txt")
> targetfile.zip <- paste(targetfile, sep="", ".zip")
> if (!file.exists(targetfile)) {
>     if (!file.exists(targetfile.zip)) {
>         data.url <- paste(sep="", "http://www.targetscan.org/vert_50/",
>                           "vert_50_data_download/Predicted_Targets_Info.txt.zip")
>         download.file(data.url, destfile=targetfile.zip)
>     }
>     unzip(targetfile.zip, basename(targetfile), exdir=tempdir())
>     file.exists(targetfile) || warning("Didn't extract targetfile from zip")
> }
>
> familyfile <- file.path(tempdir(), "miR_Family_Info.txt")
> familyfile.zip <- paste(familyfile, sep="", ".zip")
> if (!file.exists(familyfile)) {
>     if (!file.exists(familyfile.zip)) {
>         data.url <- paste(sep="", "http://www.targetscan.org/vert_50/",
>                           "vert_50_data_download/miR_Family_Info.txt.zip")
>         download.file(data.url, destfile=familyfile.zip)
>     }
>     unzip(familyfile.zip, basename(familyfile), exdir=tempdir())
>     file.exists(familyfile) || warning("Didn't extract familyfile from zip")
> }
>
> taxfile <- file.path(getwd(), "names.dmp")
> taxfile.zip <- file.path(tempdir(), "taxdmp.zip")
> if (!file.exists(taxfile)) {
>     if (!file.exists(taxfile.zip)) {
>         data.url <- "ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdmp.zip"
>         download.file(data.url, destfile=taxfile.zip)
>     }
>     # extract next to taxfile so the check below looks in the right place
>     unzip(taxfile.zip, "names.dmp", exdir=dirname(taxfile))
>     file.exists(taxfile) || warning("Didn't extract taxfile from zip")
> }
>
>
>> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] RSQLite_0.11.1 DBI_0.2-5 AnnotationDbi_1.18.1
> [4] Biobase_2.16.0 BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] IRanges_1.14.4 stats4_2.15.1 tools_2.15.1
>
>
> cheers,
> Mark
>
> -----------------------------------------------------
> Mark Cowley, PhD
>
> Pancreatic Cancer Program | Peter Wills Bioinformatics Centre
> Garvan Institute of Medical Research, Sydney, Australia
> -----------------------------------------------------
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor