[R] using xpath with xml2
Ben Tupper
btupper @end|ng |rom b|ge|ow@org
Tue Nov 12 20:34:51 CET 2019
Hi,
I have mined XML extensively with R before now, but my xpath chops seem to be regressing recently. I know that I can roll up my sleeves and search through the child nodes of the root, but I can't noodle out why using the xpath description returns an empty nodeset.
Any suggestions and nudges most welcome.
### START
library(xml2)
library(httr)
library(magrittr)
daymet_uri <- "https://thredds.daac.ornl.gov/thredds/catalog/ornldaac/1328/catalog.xml"
# run the following to show the node in a browser
# httr::BROWSE(daymet_uri)
daymet <- httr::GET(daymet_uri) %>%
  httr::content(type = "text/xml", encoding = "UTF-8")
# list the children "service" and "dataset"
daymet %>% xml2::xml_children()
#{xml_nodeset (2)}
#[1] <service name="all" serviceType="Compound" base="">\n <service name="odap" serviceTyp ...
#[2] <dataset name="Daymet: Daily Surface Weather Data on a 1-km Grid for North America, Ve ...
# find all descendants of node name "dataset"
#
# according to this tutorial we should find 'dataset'
# https://www.w3schools.com/xml/xpath_syntax.asp
daymet %>% xml2::xml_find_all(xpath = "//dataset")
# {xml_nodeset (0)}
# I have also tried every other xpath combination I can think of, e.g.
# ".//dataset", "./dataset", "/dataset" and "dataset"
# They each yield an empty nodeset
### END
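One possible explanation, stated here as an assumption and not checked against the live catalog, is that the root element declares a default XML namespace. XPath 1.0 matches an un-prefixed name such as "dataset" only against nodes in no namespace, so "//dataset" would come back empty. The sketch below reproduces that symptom offline; the inline document and its namespace URI are hypothetical stand-ins, not the real catalog.

```r
library(xml2)

# Hypothetical stand-in for the catalog; the namespace URI is made up
# for illustration and is NOT taken from the real THREDDS document.
txt <- '<catalog xmlns="http://example.org/hypothetical-ns">
  <service name="all" serviceType="Compound" base=""/>
  <dataset name="outer">
    <dataset name="inner"/>
  </dataset>
</catalog>'
doc <- xml2::read_xml(txt)

# Un-prefixed names only match nodes in no namespace, so this is empty
plain <- xml2::xml_find_all(doc, "//dataset")
length(plain)

# xml2 exposes the default namespace under the generated prefix "d1"
ns <- xml2::xml_ns(doc)
prefixed <- xml2::xml_find_all(doc, "//d1:dataset", ns)
xml2::xml_attr(prefixed, "name")

# A namespace-agnostic alternative that tests only the local name
by_local <- xml2::xml_find_all(doc, "//*[local-name()='dataset']")
length(by_local)

# Or drop the default namespace; xml_ns_strip() modifies the document
# in place, so a fresh parse is used to leave 'doc' untouched
stripped_doc <- xml2::read_xml(txt)
xml2::xml_ns_strip(stripped_doc)
stripped <- xml2::xml_find_all(stripped_doc, "//dataset")
length(stripped)
```

If the real catalog behaves the same way, xml2::xml_ns(daymet) should list a "d1" entry, and either the prefixed query or xml2::xml_ns_strip(daymet) would be worth trying before falling back to walking the child nodes by hand.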
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] magrittr_1.5 httr_1.4.1 xml2_1.2.2
loaded via a namespace (and not attached):
[1] compiler_3.5.1 R6_2.4.0 tools_3.5.1 curl_4.2
[5] yaml_2.2.0 Rcpp_1.0.3
Thanks,
Ben
Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
Ecological Forecasting: https://eco.bigelow.org/