[BioC] dbSNP build for R package SNPlocs.Hsapiens.dbSNP.20080617

Hervé Pagès hpages at fhcrc.org
Wed Jun 3 22:05:30 CEST 2009


Hi Lin,

I'm cc'ing the BioC list so other users might benefit from this.

Lin Tang wrote:
> Dear Dr. Pages,
>
>   I am currently using the R package SNPlocs.Hsapiens.dbSNP.20080617 and
>   want to check with you whether this package corresponds to dbSNP build
>   129. The package's release date, two months after the release of dbSNP
>   build 129, makes that seem logical, but I would like to have it
>   confirmed by you. I’d appreciate your kind reply on this. Thanks!

It's hard to tell.

According to these pages:
   http://www.ncbi.nlm.nih.gov/mailman/pipermail/dbsnp-announce/2008q2/000081.html
   http://www.ncbi.nlm.nih.gov/projects/SNP/buildhistory.cgi
Build 129 was released in April 2008 (note that the exact dates given on
these two pages don't match).

A similar search shows that Build 130 was released about a month ago.

So when I downloaded the ds_flat_ch*.flat files from
   ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/ASN1_flat
to build SNPlocs.Hsapiens.dbSNP.20080617 (that was in March 2009, before
Build 130 came out), these files were presumably a dump from Build 129.

Note that the files under
   ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/ASN1_flat
can change at any time (and today they are indeed different from what they
were back in March). It's a sad thing that the SNP team at NCBI doesn't
provide permanent URLs for their past builds. And it doesn't help that
the ds_flat_ch*.flat files they provide don't contain any information
about the build they come from.

Anyway, in the future I'll put the Build information in the DESCRIPTION
file of the SNPlocs packages.

One last note. According to the SNP team at NCBI, "Human SNPs in Build 129
are mapping to NCBI build 36.3". On our side this corresponds to the
BSgenome.Hsapiens.UCSC.hg18 package: according to UCSC, hg18 is NCBI Build
36.1, but NCBI Build 36.1 and NCBI Build 36.3 are identical from a *sequence*
point of view (I think what makes them different are the annotations provided
by NCBI).
This means that, if you are planning to inject SNPlocs.Hsapiens.dbSNP.20080617
into a genome, it only makes sense to do it with BSgenome.Hsapiens.UCSC.hg18.

In the future we will put in place a mechanism to make this injection safer,
i.e. to check that the injected SNPs and the host genome are compatible.
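
To make the idea of such a check concrete, here is a minimal sketch. It is
written in Python rather than R so that it runs without any Bioconductor
package installed, and everything in it is hypothetical: the names
SEQUENCE_EQUIVALENT, same_sequence() and check_injection() are made up for
illustration and are not part of BSgenome or of the SNPlocs packages. The
only fact it encodes is the one stated above, namely that hg18, NCBI Build
36.1 and NCBI Build 36.3 share the same sequence.

```python
# Hypothetical sketch: none of these names come from BSgenome or SNPlocs.

# Assembly names grouped by identical sequence. The single group below
# restates the claim made in this message: hg18 is NCBI Build 36.1, and
# Builds 36.1 and 36.3 do not differ in sequence.
SEQUENCE_EQUIVALENT = [
    {"hg18", "NCBI Build 36.1", "NCBI Build 36.3"},
]


def same_sequence(assembly_a, assembly_b):
    """Return True if both names denote the same genome sequence."""
    if assembly_a == assembly_b:
        return True
    return any(assembly_a in group and assembly_b in group
               for group in SEQUENCE_EQUIVALENT)


def check_injection(snp_assembly, genome_assembly):
    """Raise ValueError if SNPs mapped to snp_assembly should not be
    injected into a genome built from genome_assembly."""
    if not same_sequence(snp_assembly, genome_assembly):
        raise ValueError(
            "SNPs mapped to %s cannot be injected into a %s genome"
            % (snp_assembly, genome_assembly))


# dbSNP Build 129 SNPs (mapped to NCBI Build 36.3) into hg18: accepted.
check_injection("NCBI Build 36.3", "hg18")

# The same SNPs into an older assembly: rejected.
try:
    check_injection("NCBI Build 36.3", "hg17")
except ValueError as err:
    print(err)
```

A real mechanism would of course need each SNPlocs package to declare the
assembly its SNPs are mapped to, which is exactly the kind of information
that is missing today.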

Cheers,
H.


> 
> 
>   Regards,
> 
> Lin Tang, Ph.D.
> 
> Scientist , Informatics | Sequenom  Inc.
> 
> T: 1 858 202 9106 | F: 1 858 202 9084 | E: ltang at sequenom.com
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


