Suppose I have an XML document like the following:
my_data.xml
<?xml version="1.0" encoding="UTF-8"?>
<data>
    <country name="Liechtenstein" xmlns="aaa:bbb:ccc:liechtenstein:eee">
        <rank updated="yes">2</rank>
        <holidays>
            <christmas>Yes</christmas>
        </holidays>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore" xmlns="aaa:bbb:ccc:singapore:eee">
        <continent>Asia</continent>
        <holidays>
            <christmas>Yes</christmas>
        </holidays>
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama" xmlns="aaa:bbb:ccc:panama:eee">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
    <ethnicity xmlns="aaa:bbb:ccc:ethnicity:eee">
        <malay>
            <holidays>
                <ramadan>Yes</ramadan>
            </holidays>
        </malay>
    </ethnicity>
</data>
Parsing:
from lxml import etree

xtree = etree.parse('my_data.xml')
xroot = xtree.getroot()
I want to traverse the tree and process every branch except certain ones. In this example, I want to exclude the ethnicity branch:
node_to_exclude = xroot.xpath('.//*[local-name()="ethnicity"]')
exclude_path = xtree.getelementpath(node_to_exclude[0])
for element in xroot.iter('*'):
    if exclude_path not in xtree.getelementpath(element):
        # ...do stuff...
But this still traverses the entire tree. Is there a better or faster way to do this (i.e. one that skips the entire ethnicity branch at once)? I'm looking for a syntactic solution, not a recursive algorithm.
XPath can do this for you:
for element in xroot.xpath('.//*[not(ancestor-or-self::*[local-name()="ethnicity"])]'):
    # ...do stuff...
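As a quick check of that expression, here is a minimal sketch that parses a cut-down version of the sample document from a string and lists the local names of the selected elements. The shortened XML and the names XML, kept and kept_names are assumptions made for illustration only.

```python
from lxml import etree

# Cut-down version of the sample document (illustrative, not the full file).
XML = b"""<data>
  <country name="Panama" xmlns="aaa:bbb:ccc:panama:eee">
    <rank updated="yes">69</rank>
    <year>2011</year>
  </country>
  <ethnicity xmlns="aaa:bbb:ccc:ethnicity:eee">
    <malay>
      <holidays>
        <ramadan>Yes</ramadan>
      </holidays>
    </malay>
  </ethnicity>
</data>"""

xroot = etree.fromstring(XML)

# Select every descendant that is neither an <ethnicity> element
# nor located anywhere below one.
kept = xroot.xpath('.//*[not(ancestor-or-self::*[local-name()="ethnicity"])]')

# Drop the namespace part of each tag so the result is easier to read.
kept_names = [etree.QName(el).localname for el in kept]
print(kept_names)  # ['country', 'rank', 'year']
```

Note that the XPath engine still has to evaluate the predicate for the elements inside the excluded branch; what this saves is the Python-level loop and the getelementpath() call per element.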
Specifying which ancestor you mean might improve performance, or it might not, so measure it. For example, if <ethnicity xmlns="..."> is always a child of the top-level element, i.e. "the penultimate ancestor" (the second-to-last node on the ancestor-or-self axis), you could do this:
for element in xroot.xpath('.//*[not(ancestor-or-self::*[last()-1][local-name()="ethnicity"])]'):
    # ...do stuff...
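One way to follow the "measure it" advice is a small timeit comparison. The sketch below first checks that both expressions select the same elements, then times them. The cut-down XML, the constant names and the run count of 1000 are assumptions for illustration; timings on such a small document say little about a real file, so run it against your own data.

```python
import timeit
from lxml import etree

# Cut-down version of the sample document (illustrative, not the full file).
XML = b"""<data>
  <country name="Panama" xmlns="aaa:bbb:ccc:panama:eee">
    <rank updated="yes">69</rank>
    <year>2011</year>
  </country>
  <ethnicity xmlns="aaa:bbb:ccc:ethnicity:eee">
    <malay>
      <holidays>
        <ramadan>Yes</ramadan>
      </holidays>
    </malay>
  </ethnicity>
</data>"""

xroot = etree.fromstring(XML)

# The two expressions from the answer.
GENERIC = './/*[not(ancestor-or-self::*[local-name()="ethnicity"])]'
PENULTIMATE = './/*[not(ancestor-or-self::*[last()-1][local-name()="ethnicity"])]'

# Sanity check: both variants should pick the same elements, in the same order.
generic_tags = [el.tag for el in xroot.xpath(GENERIC)]
penultimate_tags = [el.tag for el in xroot.xpath(PENULTIMATE)]
same_selection = generic_tags == penultimate_tags
print("same selection:", same_selection)

# Time each expression; absolute numbers depend on machine and document size.
for label, expr in (("generic", GENERIC), ("penultimate", PENULTIMATE)):
    seconds = timeit.timeit(lambda: xroot.xpath(expr), number=1000)
    print(f"{label}: {seconds:.4f} s for 1000 runs")
```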
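Of course, you can also loop over the top-level children yourself and skip the unwanted branch entirely:
for child in xroot:
    if 'ethnicity' in child.tag:
        continue
    # Use a path relative to child; '//*' would select every element in the whole document.
    for element in child.xpath('descendant-or-self::*'):
        # ...do stuff...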
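If the branch to skip can sit at any depth, another non-recursive option, not part of the answer above, is lxml's etree.iterwalk() together with its skip_subtree() method. This is a sketch under the assumption that your lxml version is recent enough to provide skip_subtree() (it was added in the 4.x series); the cut-down XML and the names walker and visited are illustrative.

```python
from lxml import etree

# Cut-down version of the sample document (illustrative, not the full file).
XML = b"""<data>
  <country name="Panama" xmlns="aaa:bbb:ccc:panama:eee">
    <rank updated="yes">69</rank>
    <year>2011</year>
  </country>
  <ethnicity xmlns="aaa:bbb:ccc:ethnicity:eee">
    <malay>
      <holidays>
        <ramadan>Yes</ramadan>
      </holidays>
    </malay>
  </ethnicity>
</data>"""

xroot = etree.fromstring(XML)

visited = []
walker = etree.iterwalk(xroot, events=("start", "end"))
for event, element in walker:
    if event != "start":
        continue  # only act when an element is entered
    if etree.QName(element).localname == "ethnicity":
        # Do not descend into this element; its children are never yielded.
        walker.skip_subtree()
        continue
    # ...do stuff... (here: just record the local name)
    visited.append(etree.QName(element).localname)

print(visited)
```

Unlike the XPath variants, this walk also yields the root element itself, so filter it out if you only want its descendants.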