[R] SOLVED: importing huge XML-Files -- new problem: special characters
    Alexander Heidrich 
    alexander.heidrich at uni-jena.de
       
    Tue Sep  4 18:17:14 CEST 2007
    
    
  
Hi all,
thanks to the people who replied to my question! I finally solved the
issue by writing my own handlers and using xmlEventParse - which leads
to the following problem, which is so odd that it's probably a bug.
I use several special characters in my XML file, e.g. umlauts or ° or
µ - but no matter how I encode my XML (UTF or ISO) or whether I escape
these characters, xmlEventParse always stops parsing after the first
umlaut and pretends to have more than one node even if there is really
just one!
Example:
<locations>abc	aböcd	abdec</locations>
causes two events for locations and produces output in the form of:
	[,1]	[,2]	[,3]
[1,]	abc
[2,]	aböcd	abdec
Should it be like that? If I remove the umlauts, then everything is
fine!
If I do the following:
<locations>öabc	aböcd	abdec</locations>
the output is
	[,1]	[,2]	[,3]
[1,]	öabc	aböcd	abdec
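A minimal sketch of handlers for this case, assuming the XML package. The names collect_text, buf and results are made up for illustration and this is not the exact code in use here. It gathers every chunk passed to the text handler and joins the chunks when the element closes, so the result should not depend on how many text events are fired for a single node. trim = FALSE is set on the assumption that whitespace at a chunk boundary would otherwise be stripped.

```r
library(XML)

# Hypothetical helper: returns one string per <locations> element,
# no matter how many text events the parser fires for it.
collect_text <- function(xml_string) {
  buf <- character(0)      # chunks seen since the element was opened
  results <- character(0)  # one entry per closed <locations> element

  handlers <- list(
    startElement = function(name, attrs, ...) {
      # start a fresh buffer for each new <locations> element
      if (name == "locations") buf <<- character(0)
    },
    text = function(content, ...) {
      # may be called several times for one node, so only append here
      buf <<- c(buf, content)
    },
    endElement = function(name, ...) {
      # the element is complete: join all chunks into one string
      if (name == "locations") {
        results <<- c(results, paste(buf, collapse = ""))
      }
    }
  )

  # trim = FALSE keeps tabs that happen to sit at a chunk boundary
  xmlEventParse(xml_string, handlers = handlers, asText = TRUE, trim = FALSE)
  results
}

locations <- collect_text("<locations>abc\tab\u00f6cd\tabdec</locations>")
fields <- strsplit(locations, "\t", fixed = TRUE)[[1]]
print(fields)
```

Splitting on the tab only after the element has closed keeps the row assembly separate from whatever the parser does with the text in between.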
Any suggestions?
Thanks in advance and many greetings!
Alex
    
    