[BioC] vmatchPDict?

David Iles D.E.Iles at leeds.ac.uk
Fri Dec 14 12:45:33 CET 2012


Hi,

I need to re-map the probe sequences of the Affymetrix Bovine genome array to a recent draft sequence of the sheep genome (please, don't ask why...). As a first step, I successfully created a new BSgenome package from a seed file, listing the individual chromosomes as 'seqnames' and the two multiple-sequence fasta files of unmapped sequence (unmapped_scaffolds and unmapped_contigs) as 'mseqnames', as per the forgeBSgenomeDataPkg vignette (see session info below).

When I called the matchPDict() function to map the probe sequences to the + and - strands of the individual chromosomes, all went smoothly, but the following error occurred with the multiple-sequence objects:

> runAnConScaff(bt.probes.all, outfile="bt.probes.2.oarv3.1.unmapped.txt")

Target: strand + of Oar v3.1 sequence unmapped_scaffolds, unmapped_contigs 
>>> Finding all hits in strand + of sequence unmapped_scaffolds ...
Error in matchPDict(pdict, subject) : 
  please use vmatchPDict() when 'subject' is an XStringSet object (multiple sequence)

So I edited my script to call vmatchPDict() instead, with the following result:

> runAnConScaff(bt.probes.all, outfile="bt.probes.2.oarv3.1.unmapped.txt")

Target: strand + of Oar v3.1 sequence unmapped_scaffolds, unmapped_contigs 
>>> Finding all hits in strand + of sequence unmapped_scaffolds ...
Error in .local(pdict, subject, max.mismatch, min.mismatch, with.indels,  : 
  vmatchPDict() is not ready yet, sorry

I can work around this by splitting each multiple-sequence file into lots of small fasta files, each holding a single sequence, but I wondered: will the vmatchPDict() function be ready in the not-too-distant future?
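
For illustration, here is a minimal sketch of an in-memory variant of that workaround, assuming the multiple sequences are already available as a DNAStringSet: matchPDict() is simply called on each sequence in turn, so no files need to be split. The helper name matchPDictEach and the toy probes and subjects are hypothetical and are not part of Biostrings or of the script above.

```r
library(Biostrings)

# Hypothetical helper (not a Biostrings function): run matchPDict() on each
# sequence of a DNAStringSet, since matchPDict() accepts a single subject only.
matchPDictEach <- function(pdict, subjects, ...) {
  hits <- lapply(seq_along(subjects), function(i) {
    # subjects[[i]] extracts one sequence as a DNAString
    matchPDict(pdict, subjects[[i]], ...)
  })
  names(hits) <- names(subjects)
  hits  # list of MIndex objects, one per subject sequence
}

# Toy data, for illustration only
probes   <- DNAStringSet(c("ACGT", "GGCC"))
subjects <- DNAStringSet(c(s1 = "AAACGTGGCC", s2 = "TTGGCCTT"))

pdict <- PDict(probes)            # all probes must have the same width
hits  <- matchPDictEach(pdict, subjects)
startIndex(hits[["s1"]])          # start positions of each probe's hits in s1
```

For the - strand, the same call could presumably be made on reverseComplement(subjects), bearing in mind that the reported positions would then be relative to the reverse-complemented sequences.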

Many thanks

Dr David Iles
School of Biology
University of Leeds
Leeds LS2 9JT

d.e.iles at leeds.ac.uk

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BSgenome.Oaries.ISGC.Oarv3.1 BSgenome_1.26.1                  Biostrings_2.26.2               
[4] GenomicRanges_1.10.5             IRanges_1.16.4                   BiocGenerics_0.4.0              

loaded via a namespace (and not attached):
[1] parallel_2.15.2 stats4_2.15.2   tools_2.15.2   
> 


