
How to use iterparse to get an XML snippet

I'm trying to parse a large KML (XML) file. My Python script contains the following:

import xml.etree.ElementTree as Etree
for event, elem in Etree.iterparse("tracts.kml", events=('start', 'end')):
    if event == 'end' and elem.tag == '{http://www.opengis.net/kml/2.2}MultiGeometry':
        print(elem)

The XML looks like this:

<MultiGeometry>
     <Polygon>
         <Altitude>
         </Altitude>
         <coordinates>
         </coordinates>
     </Polygon>
</MultiGeometry>

What I want is to export everything inside <MultiGeometry></MultiGeometry>, including the child tags and the text inside each of them.

That is, the output should be a string that looks like <Polygon>...</Polygon>.

elem.text only returns the text that sits directly inside the element, before its first child tag, not the children themselves. How do I get all of it?

Thanks.

If you want the complete XML of an element as a string, you can use the xml.etree.ElementTree.tostring() function. Note that by default it returns a byte string (encoded with the encoding passed to the function, which defaults to 'us-ascii'), so you need to call decode() on the result to get a str.

Example:

>>> import xml.etree.ElementTree as ET
>>> r = ET.fromstring('''<MultiGeometry>
...      <Polygon>
...          <Altitude>
...          </Altitude>
...          <coordinates>
...          </coordinates>
...      </Polygon>
... </MultiGeometry>''')
>>> ET.tostring(r)
b'<MultiGeometry>\n     <Polygon>\n         <Altitude>\n         </Altitude>\n         <coordinates>\n         </coordinates>\n     </Polygon>\n</MultiGeometry>'
>>> print(ET.tostring(r).decode())
<MultiGeometry>
     <Polygon>
         <Altitude>
         </Altitude>
         <coordinates>
         </coordinates>
     </Polygon>
</MultiGeometry>
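
To tie this back to the iterparse loop in the question, here is a minimal sketch that returns only the inner XML of each MultiGeometry (the children, without the outer tag). The helper name multigeometry_snippets and the tiny in-memory sample document are made up for illustration, and the sketch assumes the file uses the KML 2.2 namespace from the question. Passing encoding="unicode" makes tostring() return a str directly, so no decode() is needed. Because the elements are namespaced, the serialized children carry an xmlns declaration; registering the namespace with an empty prefix keeps the tags unprefixed, but note that register_namespace() changes module-wide state.

```python
import io
import xml.etree.ElementTree as ET

KML_URI = "http://www.opengis.net/kml/2.2"
KML_NS = "{" + KML_URI + "}"

# Serialize KML tags without an "ns0:" prefix (global, module-wide setting).
ET.register_namespace("", KML_URI)


def multigeometry_snippets(source):
    """Yield the inner XML of each MultiGeometry element as a str.

    `source` is a filename or a binary file object. Hypothetical helper,
    not part of ElementTree.
    """
    # Only "end" events are needed: the element is complete at that point.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == KML_NS + "MultiGeometry":
            # Serialize each child; encoding="unicode" gives str, not bytes.
            yield "".join(ET.tostring(child, encoding="unicode") for child in elem)
            # Drop the processed subtree to keep memory use down on large files.
            elem.clear()


# Small in-memory sample standing in for "tracts.kml".
sample = io.BytesIO(
    b'<kml xmlns="http://www.opengis.net/kml/2.2"><Placemark><MultiGeometry>'
    b"<Polygon><coordinates>1,2 3,4</coordinates></Polygon>"
    b"</MultiGeometry></Placemark></kml>"
)
for snippet in multigeometry_snippets(sample):
    print(snippet)
```

For the real file, call multigeometry_snippets("tracts.kml") instead. If you want the outer <MultiGeometry> tag as well, serialize elem itself rather than its children.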

