[R] (simple) xml into data.frame and reverse
Duncan Temple Lang
duncan at wald.ucdavis.edu
Wed Jul 15 06:58:55 CEST 2009
stefan.duke at gmail.com wrote:
> Hello,
> I am trying to convert a simple data.frame (it will always be a few
> equally long variables) into the XML format (which I don't understand
> too well but need as input for another program) and reverse the
> operation (from XML back into data.frame).
> I found some code which does the first and it works well enough for me
> (see below). Is there an easy way to reverse the operation?
> My XML files are nothing fancy (no child nodes or anything, at least as
> far as I can see).
Just for the record, there are child nodes.
You have a top-level node <populationsize>
This has several children <size>. And each of these has
<age>, <sex> and <number> as children.
You don't have sub-nodes of these, so the hierarchy is relatively flat
and does correspond to a data frame with each <size> node
as an observation and <age>, <sex> and <number> as variables/columns.
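
As a rough sketch of that mapping (this is not the code referred to
further down, just an illustration), one could pull each column out with
an XPath query. It assumes the XML package is installed and that the
document has the structure shown in the appendix; the names txt, doc,
column and pop are made up for the example.

```r
library(XML)

# The records from the appendix, inlined as text for the illustration
txt <- paste0(
  "<populationsize>",
  "<size><age>0</age><sex>0</sex><number>500</number></size>",
  "<size><age>0</age><sex>1</sex><number>300</number></size>",
  "<size><age>1</age><sex>0</sex><number>200</number></size>",
  "<size><age>1</age><sex>1</sex><number>400</number></size>",
  "</populationsize>")

doc <- xmlParse(txt, asText = TRUE)

# Hypothetical helper: the value of the named child in every <size> node,
# converted to integer (one element per observation)
column <- function(name)
  as.integer(xpathSApply(doc, paste0("/populationsize/size/", name), xmlValue))

# Each <size> node becomes a row; <age>, <sex>, <number> become columns
pop <- data.frame(age = column("age"),
                  sex = column("sex"),
                  number = column("number"))
```

This only works because every <size> node has exactly one of each child;
with missing children the columns would have different lengths.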
I wrote some relatively general, but hastily written, functions
to read this sort of data. You can find them attached or at
http://www.omegahat.org/RSXML/xmlToDataFrame.xml
You can use these as
xmlToDataFrame("size.xml")
It handles homogeneous and non-homogeneous nodes
(i.e. with the same number and names of sub-nodes or not)
and also allows one to specify colClasses, somewhat similar
to that in read.table() (but not completely implemented yet).
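
A minimal sketch of a call, assuming the function is available (from the
attached code or from the XML package once it is included there). Since
the colClasses handling is not finished, the sketch converts the columns
explicitly afterwards instead of relying on it. The temporary file and
the names f, chr and sizes are made up for the illustration.

```r
library(XML)

# Write a small file with the same layout as the appendix
# (a stand-in for size.xml)
f <- tempfile(fileext = ".xml")
writeLines(paste0(
  "<populationsize>",
  "<size><age>0</age><sex>0</sex><number>500</number></size>",
  "<size><age>1</age><sex>1</sex><number>400</number></size>",
  "</populationsize>"), f)

# One row per <size> node; the values come back as text (or factors)
chr <- xmlToDataFrame(f)

# Convert every column to integer, going through character so that the
# conversion is correct for both strings and factors
sizes <- as.data.frame(lapply(chr, function(col) as.integer(as.character(col))))
```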
These functions will most likely be in the next release of the XML
package.
Let me know if they don't work for your data.
D.
>
>
> ### data.frame
> data<- as.data.frame(cbind(c( 0 , 1 ),c( 500 , 300),c(200, 400)))
> names(data)<-c("age","0","1")
>
> ### converts data.frame into XML
> library(XML)
> xml <- xmlTree()
> xml$addTag("populationsize", close=FALSE)
> for (i in 1:nrow(data)) {
> xml$addTag("size", close=FALSE)
> for (j in names(data)) {
> xml$addTag(j, data[i, j])
> }
> xml$closeTag()
> }
> xml$closeTag()
>
> # view the result
> cat(saveXML(xml))
>
> I have also put an example of what my data looks like below.
> Thanks for any advice!
> Best and have a great day,
> Stefan
>
>
>
>
> APPENDIX
> XML-file
> ------------------
>
>
> <populationsize>
> <size>
> <age>0</age>
> <sex>0</sex>
> <number>500</number>
> </size>
> <size>
> <age>0</age>
> <sex>1</sex>
> <number>300</number>
> </size>
> <size>
> <age>1</age>
> <sex>0</sex>
> <number>200</number>
> </size>
> <size>
> <age>1</age>
> <sex>1</sex>
> <number>400</number>
> </size>
> </populationsize>
>
> ---------
> DATAFRAME
>
> age 0 1
> 0 500 300
> 1 200 400
>