[BioC] VariantAnnotation: Specifying 'seqinfo' at import with 'readVcf'
    Valerie Obenchain 
    vobencha at fhcrc.org
       
    Tue Sep 24 18:31:00 CEST 2013
    
    
  
Hi Julian,
On 09/24/2013 02:29 AM, Julian Gehring wrote:
> Hi,
>
> Is there a direct way to specify the 'seqinfo' of a genome for the
> import of a VCF file using 'readVcf'?
I think the question is how to read a subset of chromosomes/positions
from a VCF file that has no accompanying tabix index. You can't:
readVcf() requires an index when subsets are defined by
chromosome/position. You can, however, read subsets defined by INFO
and/or GENO fields without an index.
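
A minimal sketch of the index-free case, assuming the example file
ex2.vcf that ships with VariantAnnotation and its INFO fields DP/AF and
GENO field GT (the field names and the "hg19" genome label are only
illustrative; substitute the ones in your own file):

```r
library(VariantAnnotation)

## Example VCF shipped with the package; plain text, no tabix index.
fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")

## Import only the INFO fields DP and AF and the GENO field GT.
## No 'which' is given, so no index is needed.
param <- ScanVcfParam(info = c("DP", "AF"), geno = "GT")
vcf <- readVcf(fl, "hg19", param)

## Only the requested fields are present in the result.
colnames(info(vcf))
names(geno(vcf))
```

This restricts which fields are parsed, not which records are read, so
all positions in the file are still imported.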
Approaches:
(1) create index with ?indexTabix and specify 'which' in ScanVcfParam
(2) use ?filterVcf to write out a new file of records of interest
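
Both approaches might look roughly like the sketch below. It again
assumes the ex2.vcf example file from VariantAnnotation; the range on
chromosome "20", the "hg19" label, the yieldSize and the prefilter
function are made up for illustration and are not from your data:

```r
library(VariantAnnotation)  # also attaches Rsamtools (bgzip, indexTabix)

fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")

## (1) Compress with bgzip, build a tabix index, then import a range.
bgz <- bgzip(fl, tempfile(fileext = ".vcf.bgz"))
idx <- indexTabix(bgz, format = "vcf")
tbx <- TabixFile(bgz, idx)

rng <- GRanges("20", IRanges(start = 1, end = 100000))  # illustrative range
sub <- readVcf(tbx, "hg19", ScanVcfParam(which = rng))

## (2) Write the records of interest to a new file with filterVcf().
## Prefilters see the unparsed text lines; this one keeps chromosome 20.
keepChr20 <- function(x) grepl("^20\t", x)
dest <- tempfile(fileext = ".vcf")
filterVcf(TabixFile(bgz, idx, yieldSize = 1000), "hg19", dest,
          prefilters = FilterRules(list(chr20 = keepChr20)),
          verbose = FALSE)
kept <- readVcf(dest, "hg19")
```

With (1) the index is built once and can be reused for any later range
query; with (2) the smaller file can be read in full afterwards.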
> I'm aware that one can change it
> with the 'seqinfo' method afterwards, but for large VCF files this can
> take a significant amount of time.
What operation is taking a long time? Subsetting the VCF object by
chromosome?
Valerie
>
> An alternative would be to sneak it in by the 'which' arguments, such as:
>
> readVcf(file, genome, ScanVcfParam(which = as(seq_info, "GRanges")))
>
> but this requires the file to be indexed beforehand.
>
> Best wishes
> Julian
>
    
    