[BioC] Bsgenome Zv7 sequence data

hpages at fhcrc.org
Wed Jul 29 06:41:57 CEST 2009


Hi Julie,

The Zebrafish (Danio rerio) genome is now available in BioC release
and will soon be in BioC devel:

> library(BSgenome)
> available.genomes()
  [1] "BSgenome.Amellifera.BeeBase.assembly4"
  [2] "BSgenome.Amellifera.UCSC.apiMel2"
  [3] "BSgenome.Athaliana.TAIR.01222004"
  [4] "BSgenome.Athaliana.TAIR.04232008"
  [5] "BSgenome.Btaurus.UCSC.bosTau3"
  [6] "BSgenome.Btaurus.UCSC.bosTau4"
  [7] "BSgenome.Celegans.UCSC.ce2"
  [8] "BSgenome.Cfamiliaris.UCSC.canFam2"
  [9] "BSgenome.Dmelanogaster.UCSC.dm2"
[10] "BSgenome.Dmelanogaster.UCSC.dm3"
[11] "BSgenome.Drerio.UCSC.danRer5"
[12] "BSgenome.Ecoli.NCBI.20080805"
[13] "BSgenome.Ggallus.UCSC.galGal3"
[14] "BSgenome.Hsapiens.UCSC.hg17"
[15] "BSgenome.Hsapiens.UCSC.hg18"
[16] "BSgenome.Hsapiens.UCSC.hg19"
[17] "BSgenome.Mmusculus.UCSC.mm8"
[18] "BSgenome.Mmusculus.UCSC.mm9"
[19] "BSgenome.Ptroglodytes.UCSC.panTro2"
[20] "BSgenome.Rnorvegicus.UCSC.rn4"
[21] "BSgenome.Scerevisiae.UCSC.sacCer1"

> source("http://bioconductor.org/biocLite.R")
> biocLite("BSgenome.Drerio.UCSC.danRer5")
...
> library(BSgenome.Drerio.UCSC.danRer5)
> Drerio
Zebrafish genome
|
| organism: Danio rerio (Zebrafish)
| provider: UCSC
| provider version: danRer5
| release date: Jul. 2007
| release name: Sanger Institute Zv7
|
| single sequences (see '?seqnames'):
|   chr1   chr2   chr3   chr4   chr5   chr6   chr7   chr8   chr9   chr10  chr11
|   chr12  chr13  chr14  chr15  chr16  chr17  chr18  chr19  chr20  chr21  chr22
|   chr23  chr24  chr25  chrM
|
| multiple sequences (see '?mseqnames'):
|   Zv7_NA        Zv7_scaffold  upstream1000  upstream2000  upstream5000
|
| (use the '$' or '[[' operator to access a given sequence)

> Drerio$chr1
   56204684-letter "MaskedDNAString" instance (# for masking)
seq: CACACACTCATACACTACGGCCAGTGTAGTTGATCA...GGAGGATCTGACGTCTGTGAGCAAACACAAACACAC
masks:
   maskedwidth  maskedratio active names                               desc
1      150400 2.675934e-03   TRUE AGAPS                      assembly gaps
2         288 5.124128e-06   TRUE   AMB           intra-contig ambiguities
3    26544901 4.722898e-01  FALSE    RM                       RepeatMasker
4     1576324 2.804613e-02  FALSE   TRF Tandem Repeats Finder [period<=12]
all masks together:
   maskedwidth maskedratio
      26736688   0.4757021
all active masks together:
   maskedwidth maskedratio
        150688 0.002681058
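
To pull out the sequences for a set of coordinates, getSeq() is probably
the easiest route. Below is a minimal sketch, assuming your regions sit in
a data frame with one row per region. The 'regions' table and every
coordinate in it are made up for illustration, so substitute your own:

```r
# Hypothetical regions of interest (made-up coordinates, 1-based, inclusive).
regions <- data.frame(
    chrom  = c("chr1", "chr1", "chr2"),
    start  = c(1001, 250001, 5001),
    end    = c(1050, 250100, 5060),
    strand = c("+", "-", "+"),
    stringsAsFactors = FALSE
)

# The extraction step is skipped if the genome package is not installed.
if (requireNamespace("BSgenome.Drerio.UCSC.danRer5", quietly = TRUE)) {
    library(BSgenome.Drerio.UCSC.danRer5)
    # getSeq() is vectorized over names/start/end/strand:
    # one sequence is returned per row of 'regions'.
    regions$sequence <- getSeq(Drerio,
                               names = regions$chrom,
                               start = regions$start,
                               end = regions$end,
                               strand = regions$strand,
                               as.character = TRUE)
    print(regions)
}
```

With as.character=TRUE you get a plain character vector, one sequence per
region. Regions on the "-" strand come back reverse-complemented.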

Cheers,
H.


Quoting Julie Zhu <julie.zhu at umassmed.edu>:

> Hi Herve,
>
> I need to obtain sequence data for a set of given coordinates. Do you know
> whether Zv7 (zebrafish) sequence data will be made available for the BSgenome
> package? Thanks!
>
> Best regards,
>
> Julie
>
>
> *******************************************
> Julie Zhu, Ph.D
> Research Assistant Professor
> Program Gene Function and Expression
> University of Massachusetts Medical School
> 364 Plantation Street, Room 613
> Worcester, MA 01605
> 508-856-5256
> http://www.umassmed.edu/pgfe/faculty/zhu.cfm
>
>
>
>


