[BioC] paper - download - pubmed
Chris Stubben
stubben at lanl.gov
Tue Jan 22 18:40:04 CET 2013
What you suggested works perfectly for PMC ids, but not for PM ids.
For example, if I need the PDF for PM id 10417722, what should I do?
From my institute, I'm allowed to download papers from various journals,
but the problem is that I can only get the papers annotated with PMC ids,
not those with only PM ids.
There are a few ways to get PMC ids from pubmed ids using E-utilities and
the genomes package.
# E-link - for a list of links see
subset( einfo("pubmed", links=TRUE), DbTo=="pmc")
# dbfrom = pubmed by default.
elink(14769935, dbto="pmc", cmd="neighbor", linkname="pubmed_pmc")
[1] 357076 # = PMC357076
# or if no PMC id available
elink(10417722, dbto="pmc", cmd="neighbor", linkname="pubmed_pmc")
numeric(0)
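Since elink returns an empty vector when there is no PMC link, it may help to wrap the result so both cases are handled the same way. This is only a sketch: format_pmcid is a made-up helper name, not something in the genomes package, and it assumes the value passed in is the numeric vector that elink printed above.

```r
# Hypothetical helper (not part of the genomes package): turn the numeric
# result of elink() into a "PMC..." label, or NA when nothing was linked.
format_pmcid <- function(link_result) {
  if (length(link_result) == 0) {
    return(NA_character_)
  }
  # Assumption: if several ids come back, the first one is the one wanted
  paste0("PMC", format(link_result[1], scientific = FALSE))
}

format_pmcid(357076)      # "PMC357076"
format_pmcid(numeric(0))  # NA
```

You could then pass the elink call directly, e.g. format_pmcid(elink(10417722, dbto="pmc", cmd="neighbor", linkname="pubmed_pmc")), and test the result with is.na().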
# or use E-fetch and get the abstract - the PMCID is listed before the PMID
# and you could use grep to grab that. Again pubmed is the default db
efetch(14769935, rettype="abstract")
[26] "PMCID: PMC357076"
[27] "PMID: 14769935 [PubMed - indexed for MEDLINE]"
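The grep step mentioned above could look roughly like this in base R. It is a sketch that assumes efetch returns a character vector with the "PMCID:" label and its value in the same element; extract_pmcid is a hypothetical name, and the sample lines are copied from the output above so the example runs without a network call.

```r
# Sample lines copied from the efetch() output shown above; in practice
# 'abstract_lines' would be the character vector returned by efetch().
abstract_lines <- c(
  "PMCID: PMC357076",
  "PMID: 14769935 [PubMed - indexed for MEDLINE]"
)

# Hypothetical helper: return the PMC ids found on lines starting "PMCID:"
extract_pmcid <- function(lines) {
  hits <- grep("^PMCID:", lines, value = TRUE)
  if (length(hits) == 0) {
    return(character(0))
  }
  # "PMC" followed by digits; the label "PMCID" itself does not match
  regmatches(hits, regexpr("PMC[0-9]+", hits))
}

extract_pmcid(abstract_lines)  # "PMC357076"
```

An empty result (character(0)) would mean the abstract has no PMCID line, which matches the numeric(0) case from elink.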
# or get XML from efetch
x <- efetch(14769935, retmode="xml")
library(XML)  # provides xmlParse and xpathSApply
doc <- xmlParse(x)
xpathSApply(doc, '//ArticleId[@IdType="pmc"]', xmlValue)
[1] "PMC357076"
If the Pubmed Id is not linked to PMC, you could read the Pubmed results
page and check if there is a link to a full text article from the publisher.
url <- "http://www.ncbi.nlm.nih.gov/pubmed/?term=10417722"
doc <- xmlParse(url)
## the results page includes a namespace, so queries look awful
xpathSApply(doc, '//x:div[@class="icons"]/x:div/x:a', xmlGetAttr, "href",
            namespaces = c("x" = "http://www.w3.org/1999/xhtml"))
[1] "http://onlinelibrary.wiley.com/resolve/openurl?genre=article&sid=nlm:pubmed&issn=0960-7412&date=1999&volume=19&issue=1&spage=9"
You could read that link and find another link to download the PDF, which
is probably different for each publisher...
http://onlinelibrary.wiley.com/doi/10.1046/j.1365-313X.1999.00491.x/pdf
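As a rough illustration of finding the PDF link on a publisher page, something like the following might work on the raw HTML text. Everything here is an assumption: find_pdf_links is a made-up helper, the "ends in /pdf or .pdf" rule is only a guess based on the Wiley link above, and the page fragment is invented for the example, so a real page will likely need a different pattern.

```r
# Hypothetical helper: pull href values that look like PDF links out of raw
# HTML text. Matching on a trailing "/pdf" or ".pdf" is an assumption.
find_pdf_links <- function(html) {
  m <- gregexpr('href="[^"]+"', html)
  hrefs <- unlist(regmatches(html, m))
  # strip the surrounding href="..." wrapper
  hrefs <- sub('^href="', "", sub('"$', "", hrefs))
  unique(grep("(/pdf|\\.pdf)$", hrefs, value = TRUE))
}

# Made-up fragment for illustration; a real publisher page will differ
page <- paste0(
  '<a href="/doi/10.1046/j.1365-313X.1999.00491.x/abstract">Abstract</a>',
  '<a href="/doi/10.1046/j.1365-313X.1999.00491.x/pdf">PDF</a>'
)
find_pdf_links(page)
```

Once you have a full URL, base R's download.file(url, destfile, mode = "wb") can save it, assuming your institute's access covers that journal.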
Chris