[R] xmlToDataFrame#Help!!!#follow-up

Gabor Grothendieck ggrothendieck at gmail.com
Sun Jan 10 19:31:38 CET 2010


Try this:

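library(XML)
# Parse into an internal document so that XPath queries can be used
doc <- xmlTreeParse("adodb.xml", useInternalNodes = TRUE)
# One string per z:row: its attribute values pasted together, space-separated
Lines <- xpathSApply(doc, "//z:row",
	function(x) do.call(paste, as.list(xmlAttrs(x))))
# Re-read those strings as a table; the column names are the first
# attribute (name) of each s:AttributeType element in the schema
DF <- read.table(textConnection(Lines), col.names =
	xpathSApply(doc, "//s:AttributeType", function(x) xmlAttrs(x)[[1]]))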

This is what I get:

> DF
      Name Sex Age Height Weight
1   Alfred   M  14   69.0  112.5
2    Alice   F  13   56.5   84.0
3  Barbara   F  13   65.3   98.0
4    Carol   F  14   62.8  102.5
5    Henry   M  14   63.5  102.5
6    James   M  12   57.3   83.0
7     Jane   F  12   59.8   84.5
8    Janet   F  15   62.5  112.5
9  Jeffrey   M  13   62.5   84.0
10    John   M  12   59.0   99.5
11   Joyce   F  11   51.3   50.5
12    Judy   F  14   64.3   90.0
13  Louise   F  12   56.3   77.0
14    Mary   F  15   66.5  112.0
15  Philip   M  16   72.0  150.0
16  Robert   M  12   64.8  128.0
17  Ronald   M  15   67.0  133.0
18  Thomas   M  11   57.5   85.0
19 William   M  15   66.5  112.0
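
One caveat with the paste/read.table route: if an attribute value ever contains a space (a name like 'Mary Ann', say), read.table will split it into two fields. A possible variant, offered only as a sketch, builds the data frame straight from the attribute vectors instead. Note that adodb_to_df is a made-up helper name, not a function in the XML package, and that it assumes the s: and z: prefixes are declared on the root element, as they are in the posted file.

```r
library(XML)

# Hypothetical helper (not part of the XML package): read an ADODB
# rowset file and return the z:row attributes as a data frame.
adodb_to_df <- function(file) {
  doc <- xmlParse(file)
  # Column names, in the order the schema lists them
  cols <- xpathSApply(doc, "//s:AttributeType", xmlGetAttr, "name")
  # One named character vector of attributes per row
  rows <- xpathApply(doc, "//z:row", xmlAttrs)
  # Index each row by column name, so a row that lacks an attribute gets NA
  mat <- do.call(rbind, lapply(rows, function(a) unname(a[cols])))
  DF <- as.data.frame(mat, stringsAsFactors = FALSE)
  names(DF) <- cols
  # Turn the character columns into numbers where every value allows it
  DF[] <- lapply(DF, type.convert, as.is = TRUE)
  DF
}
```

It would be called as DF <- adodb_to_df("adodb.xml"). The column types here are guessed by type.convert rather than taken from the dt:type entries in the schema.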



On Sun, Jan 10, 2010 at 12:59 PM, Christian Ritter <critter at ridaco.be> wrote:
> Dieter Menne pointed out that the (small) xml attachment didn't make it.
> Here is an in-line version (see end of message). Let's hope it works this
> time.
>
> I'm struggling with interpreting XML files created by ADODB as data.frames
> and I'm looking for advice.
>
> Note:
> This XML contains a result set which comes from a rectangular data array.
> I've been trying to play with parameters to the xmlToDataFrame function
> in the XML package but I don't get it to extract the data frame. Reading the
> file with xmlTreeParse seems to work without error.
>
> This is what the result should look like:
>       Name Sex Age Height Weight
> 1   Alfred   M  14   69.0  112.5
> 2    Alice   F  13   56.5   84.0
> 3  Barbara   F  13   65.3   98.0
> 4    Carol   F  14   62.8  102.5
> 5    Henry   M  14   63.5  102.5
> 6    James   M  12   57.3   83.0
> 7     Jane   F  12   59.8   84.5
> 8    Janet   F  15   62.5  112.5
> 9  Jeffrey   M  13   62.5   84.0
> 10    John   M  12   59.0   99.5
> 11   Joyce   F  11   51.3   50.5
> 12    Judy   F  14   64.3   90.0
> 13  Louise   F  12   56.3   77.0
> 14    Mary   F  15   66.5  112.0
> 15  Philip   M  16   72.0  150.0
> 16  Robert   M  12   64.8  128.0
> 17  Ronald   M  15   67.0  133.0
> 18  Thomas   M  11   57.5   85.0
> 19 William   M  15   66.5  112.0
>
> And here is the xml file
> <xml xmlns:s='uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882'
>   xmlns:dt='uuid:C2F41010-65B3-11d1-A29F-00AA00C14882'
>   xmlns:rs='urn:schemas-microsoft-com:rowset'
>   xmlns:z='#RowsetSchema'>
> <s:Schema id='RowsetSchema'>
>   <s:ElementType name='row' content='eltOnly'>
>       <s:AttributeType name='Name' rs:number='1'>
>           <s:datatype dt:type='string' rs:dbtype='str' dt:maxLength='8'
> rs:maybenull='false'/>
>       </s:AttributeType>
>       <s:AttributeType name='Sex' rs:number='2'>
>           <s:datatype dt:type='string' rs:dbtype='str' dt:maxLength='1'
> rs:maybenull='false'/>
>       </s:AttributeType>
>       <s:AttributeType name='Age' rs:number='3' rs:nullable='true'>
>           <s:datatype dt:type='float' dt:maxLength='8' rs:precision='15'
> rs:fixedlength='true'/>
>       </s:AttributeType>
>       <s:AttributeType name='Height' rs:number='4' rs:nullable='true'>
>           <s:datatype dt:type='float' dt:maxLength='8' rs:precision='15'
> rs:fixedlength='true'/>
>       </s:AttributeType>
>       <s:AttributeType name='Weight' rs:number='5' rs:nullable='true'>
>           <s:datatype dt:type='float' dt:maxLength='8' rs:precision='15'
> rs:fixedlength='true'/>
>       </s:AttributeType>
>       <s:extends type='rs:rowbase'/>
>   </s:ElementType>
> </s:Schema>
> <rs:data>
>   <z:row Name='Alfred' Sex='M' Age='14' Height='69' Weight='112.5'/>
>   <z:row Name='Alice' Sex='F' Age='13' Height='56.5' Weight='84'/>
>   <z:row Name='Barbara' Sex='F' Age='13' Height='65.299999999999997'
> Weight='98'/>
>   <z:row Name='Carol' Sex='F' Age='14' Height='62.799999999999997'
> Weight='102.5'/>
>   <z:row Name='Henry' Sex='M' Age='14' Height='63.5' Weight='102.5'/>
>   <z:row Name='James' Sex='M' Age='12' Height='57.299999999999997'
> Weight='83'/>
>   <z:row Name='Jane' Sex='F' Age='12' Height='59.799999999999997'
> Weight='84.5'/>
>   <z:row Name='Janet' Sex='F' Age='15' Height='62.5' Weight='112.5'/>
>   <z:row Name='Jeffrey' Sex='M' Age='13' Height='62.5' Weight='84'/>
>   <z:row Name='John' Sex='M' Age='12' Height='59' Weight='99.5'/>
>   <z:row Name='Joyce' Sex='F' Age='11' Height='51.299999999999997'
> Weight='50.5'/>
>   <z:row Name='Judy' Sex='F' Age='14' Height='64.299999999999997'
> Weight='90'/>
>   <z:row Name='Louise' Sex='F' Age='12' Height='56.299999999999997'
> Weight='77'/>
>   <z:row Name='Mary' Sex='F' Age='15' Height='66.5' Weight='112'/>
>   <z:row Name='Philip' Sex='M' Age='16' Height='72' Weight='150'/>
>   <z:row Name='Robert' Sex='M' Age='12' Height='64.799999999999997'
> Weight='128'/>
>   <z:row Name='Ronald' Sex='M' Age='15' Height='67' Weight='133'/>
>   <z:row Name='Thomas' Sex='M' Age='11' Height='57.5' Weight='85'/>
>   <z:row Name='William' Sex='M' Age='15' Height='66.5' Weight='112'/>
> </rs:data>
> </xml>


