[R] parse XML file
Ben Tupper
btupper at bigelow.org
Wed Jun 29 13:57:23 CEST 2011
Hi,
On Jun 29, 2011, at 6:26 AM, Kai Serschmarn wrote:
> Thank you Barry, that works fine.
> Sorry for stupid questions... however, I couldn't manage to get a
> dataframe out of this.
>
> That's what I was doing:
>
> doc = xmlRoot(xmlTreeParse("de.dwd.klis.TADM.xml"))
> dumpData <- function(doc){
> for(i in 1:length(doc)){
> stns = doc[[i]]
> for (j in 1:length(stns)){
> cat(stns$attributes['value'],stns[[j]][[1]]$value,stns[[j]]
> $attributes['date'],"\n")
> }
> }
> }
> dumpData(doc)
>
Perhaps this would work for you. It generates a list of data frames,
one for each station.
###### BEGIN
## start with your doc - split it into a list of nodes (one for each
## child)
stn <- xmlChildren(doc)
# converts a station node to a data frame
getMyStation <- function(x){
  # get the name of the station
  stationName <- xmlAttrs(x)["value"]
  # a function to extract the date and value
  getMyRecords <- function(x){
    date <- xmlAttrs(x)["date"]
    val <- xmlValue(x)
    y <- c( date, val)
    return(y)
  }
  # for each child, extract the records (xmlChildren makes the
  # iteration over the child nodes explicit)
  r <- lapply(xmlChildren(x), getMyRecords)
  nR <- length(r)
  # bind into one matrix - all characters at this point
  y <- do.call(rbind, r)
  # make a data.frame
  df <- data.frame("Station" = rep(stationName, nR), "date" = y[,1],
                   "value" = y[,2],
                   row.names = 1:nR, stringsAsFactors = FALSE)
  return(df)
}
# now loop through the station nodes - extract data into a data frame
x <- lapply(stn, getMyStation)
##### END
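If a single data frame is what you are after, the list can probably be stacked with do.call(rbind, ...). Here is a minimal sketch, assuming the list looks like the one produced above. The station names, dates and values below are made up, allStations is just a name I picked, and the ISO date format is a guess about your file.

```r
# Hypothetical stand-in for the list of per-station data frames;
# all columns are still character, as in the code above.
x <- list(
  data.frame(Station = "A", date = c("2011-06-01", "2011-06-02"),
             value = c("12.3", "13.1"), stringsAsFactors = FALSE),
  data.frame(Station = "B", date = c("2011-06-01", "2011-06-02"),
             value = c("9.8", "10.4"), stringsAsFactors = FALSE)
)

# stack the per-station data frames into one
allStations <- do.call(rbind, x)
rownames(allStations) <- NULL

# convert from character; the date format here is an assumption
allStations$value <- as.numeric(allStations$value)
allStations$date <- as.Date(allStations$date, format = "%Y-%m-%d")
str(allStations)
```

If your dates are written differently, adjust the format string passed to as.Date, or leave that column as character.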
Cheers,
Ben
Ben Tupper
Bigelow Laboratory for Ocean Sciences
180 McKown Point Rd. P.O. Box 475
West Boothbay Harbor, Maine 04575-0475
http://www.bigelow.org/