Python - Deep XML file for loop

I am working with an XML file that looks like the sample below. The real file has many more spreekbeurt elements, but I trimmed it for readability. My goal is to get the text of the voorvoegsels and achternaam elements from every spreekbeurt.

<?xml version="1.0" encoding="utf-8"?>
<officiele-publicatie xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://technische-documentatie.oep.overheid.nl/schema/op-xsd-2012-2">
  <metadata>
    <meta name="OVERHEIDop.externMetadataRecord" scheme="" content="https://zoek.officielebekendmakingen.nl/h-tk-20122013-4-2/metadata.xml" />
  </metadata>

  <handelingen>
    <agendapunt>

      <spreekbeurt nieuw="ja">
        <spreker>
          <voorvoegsels>De heer</voorvoegsels>
          <naam>
            <achternaam>Marcouch</achternaam>
          </naam> (<politiek>PvdA</politiek>):</spreker>
        <tekst status="goed">
          <al>Sample Text</al>
        </tekst>
      </spreekbeurt> 

    </agendapunt>
  </handelingen>
</officiele-publicatie>

I use a for loop to iterate over all the spreekbeurt elements in my XML file. But how do I print the voorvoegsels and achternaam for every spreekbeurt?

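import xml.etree.ElementTree as ET

# Raw string so the backslash in the (elided) path is not read as an escape
tree = ET.parse(r'...\directory')
root = tree.getroot()

# iter() walks the whole tree, so nested spreekbeurt elements are found too
for spreekbeurt in root.iter('spreekbeurt'):
    print(spreekbeurt.attrib)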

This code prints:

{'nieuw': 'nee'}
{'nieuw': 'ja'}
{'nieuw': 'nee'}
{'nieuw': 'nee'}

But how do I print the children of each spreekbeurt?

Thanks in advance!

You can use find(), passing it a path* to the target element, to locate a single element within a parent or ancestor. For example:

>>> for spreekbeurt in root.iter('spreekbeurt'):
...     v = spreekbeurt.find('spreker/voorvoegsels')
...     a = spreekbeurt.find('spreker/naam/achternaam')
...     print(v.text, a.text)
...
De heer Marcouch
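
One caveat with the loop above: find() returns None when a spreekbeurt has no matching child, and .text then raises AttributeError. A minimal sketch of a more defensive variant using findtext() with a default value follows. The inline sample XML, the surname "Jansen", and the helper name speaker_names are made up for illustration and are not part of the original file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the second spreekbeurt has no voorvoegsels element.
SAMPLE = """<handelingen>
  <spreekbeurt nieuw="ja">
    <spreker>
      <voorvoegsels>De heer</voorvoegsels>
      <naam><achternaam>Marcouch</achternaam></naam>
    </spreker>
  </spreekbeurt>
  <spreekbeurt nieuw="nee">
    <spreker>
      <naam><achternaam>Jansen</achternaam></naam>
    </spreker>
  </spreekbeurt>
</handelingen>"""


def speaker_names(root):
    """Return one (voorvoegsels, achternaam) tuple per spreekbeurt.

    Missing elements yield an empty string instead of raising an error.
    """
    names = []
    for spreekbeurt in root.iter('spreekbeurt'):
        # findtext() returns the default when the path matches nothing
        v = spreekbeurt.findtext('spreker/voorvoegsels', default='')
        a = spreekbeurt.findtext('spreker/naam/achternaam', default='')
        names.append((v.strip(), a.strip()))
    return names


root = ET.fromstring(SAMPLE)
for v, a in speaker_names(root):
    # Join with a space, dropping the gap when voorvoegsels is missing
    print((v + ' ' + a).strip())
```

Whether an empty string is the right fallback depends on what you do with the names afterwards; returning None or skipping the entry would work just as well.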

*) In fact, find() supports more than simple paths: it accepts a subset of XPath 1.0 expressions.
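
To illustrate the footnote, here is a small sketch of two features from that XPath subset: the './/' descendant search and an attribute predicate. The sample XML is a hypothetical, shortened stand-in for the real file, and the second speaker is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the real file.
SAMPLE = """<handelingen>
  <agendapunt>
    <spreekbeurt nieuw="ja">
      <spreker>
        <voorvoegsels>De heer</voorvoegsels>
        <naam><achternaam>Marcouch</achternaam></naam>
      </spreker>
    </spreekbeurt>
    <spreekbeurt nieuw="nee">
      <spreker>
        <voorvoegsels>Mevrouw</voorvoegsels>
        <naam><achternaam>Jansen</achternaam></naam>
      </spreker>
    </spreekbeurt>
  </agendapunt>
</handelingen>"""

root = ET.fromstring(SAMPLE)

# './/' searches all descendants; [@nieuw='ja'] keeps only elements
# whose nieuw attribute equals 'ja'.
new_speakers = [
    s.findtext('.//achternaam')
    for s in root.findall(".//spreekbeurt[@nieuw='ja']")
]

# iter() yields every achternaam in document order, at any depth.
all_surnames = [a.text for a in root.iter('achternaam')]

print(new_speakers)
print(all_surnames)
```

Only a limited set of XPath constructs is supported by ElementTree, so more complex expressions may need a library such as lxml.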
