[BioC] VariantAnnotation - MatrixToSnpMatrix - only returns NAs
Valerie Obenchain
vobencha at fhcrc.org
Wed Jan 23 06:13:57 CET 2013
Hi Lavinia,
If you can use the development branch, MatrixToSnpMatrix() has been
replaced there by genotypeToSnpMatrix(), which is a much more full-featured
and robust function. However, if you are using the release branch you still
need to work with MatrixToSnpMatrix(). If this is the case, please read
the man page at
?MatrixToSnpMatrix
This page outlines the cases for which the values will be NA. You
should be seeing warnings such as 'only diploid calls are included',
'only single nucleotide variants are included' or 'variants with >1 ALT
allele are set to NA'. If you are not seeing such warnings, please send
me a small sample of your VCF so I can reproduce this problem.
Valerie
On 01/22/13 17:35, Lavinia Gordon wrote:
> Hi, I have just started working with VCF files and have discovered the VariantAnnotation package, many thanks for making these functions available.
> Following the code outlined in the reference manual for MatrixToSnpMatrix, my VCF returns only NA values:
>> head(geno(vcf)$GT)
>            GHS008 GHS015 GHS025 GHS026 GHS027 GHS031 GHS033 GHS034 GHS036
> chrM:73    "1/1"  "0/0"  "1/1"  "0/0"  "0/0"  "1/1"  "0/0"  "0/0"  "0/0"
> chrM:119   "0/0"  "0/0"  "0/0"  "1/1"  "1/1"  "0/0"  "0/0"  "0/0"  "0/0"
> rs72619361 "0/0"  "1/1"  "0/0"  "0/0"  "0/0"  "0/0"  "1/1"  "1/1"  "1/1"
> chrM:150   "1/1"  "1/1"  "1/1"  "1/1"  "1/1"  "1/1"  "1/1"  "1/1"  "1/1"
> chrM:189   "0/0"  "0/0"  "0/0"  "1/1"  "1/1"  "0/0"  "0/0"  "0/0"  "0/0"
> chrM:195   "1/1"  "1/1"  "1/1"  "0/0"  "0/0"  "1/1"  "1/1"  "1/1"  "1/1"
>> head(t(as(mat$genotype, "character")))
>            GHS008 GHS015 GHS025 GHS026 GHS027 GHS031 GHS033 GHS034 GHS036
> chrM:73    "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"
> chrM:119   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"
> rs72619361 "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"
> chrM:150   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"
> chrM:189   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"
> chrM:195   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"
>
> I have run the reference manual code with the supplied VCF and it all looks good.
> I have no reason to suspect that there is anything wrong with my VCF.
> Could anyone give me any tips as to how I can troubleshoot this and work out why all the NAs are appearing?
>
> Many thanks,
>
> Lavinia Gordon
> Senior Research Officer
> Quantitative Sciences Core, Bioinformatics
>
> Murdoch Childrens Research Institute
> The Royal Children's Hospital
> Flemington Road Parkville Victoria 3052 Australia
> T 03 8341 6221
> www.mcri.edu.au
>
>> vcf
> class: VCF
> dim: 4665545 9
> genome: hg19
> exptData(1): header
> fixed(4): REF ALT QUAL FILTER
> info(19): AC AF ... SB EFF
> geno(5): AD DP GQ GT PL
> rownames(4665545): chrM:73 chrM:119 ... chrUn_gl000249:14244
> chrUn_gl000249:16222
> rowData values names(1): paramRangeID
> colnames(9): GHS008 GHS015 ... GHS034 GHS036
> colData names(1): Samples
>
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] splines stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] snpStats_1.8.1 Matrix_1.0-10 lattice_0.20-13
> [4] survival_2.37-2 VariantAnnotation_1.4.6 Rsamtools_1.10.2
> [7] Biostrings_2.26.2 GenomicRanges_1.10.6 IRanges_1.16.4
> [10] BiocGenerics_0.4.0 BiocInstaller_1.8.3
>
> loaded via a namespace (and not attached):
> [1] AnnotationDbi_1.20.3 Biobase_2.18.0 biomaRt_2.14.0
> [4] bitops_1.0-5 BSgenome_1.26.1 DBI_0.2-5
> [7] GenomicFeatures_1.10.1 grid_2.15.2 parallel_2.15.2
> [10] RCurl_1.95-3 RSQLite_0.11.2 rtracklayer_1.18.2
> [13] stats4_2.15.2 tools_2.15.2 XML_3.95-0.1
> [16] zlibbioc_1.4.0