
How to parse kml file as tree structure in python?

I have a KML file that I am trying to parse in Python. I want to pass a value such as val3 from a SimpleData element as an argument and retrieve the coordinates for only the Placemark that contains it. I have worked with XPath before; a typical expression would be:

value = '..'
for val in (//Placemark/ExtendedData/SimpleData[contains(text(), "+value+")]):
    print val.find_element_by_xpath(//coordinates)

However, I can't seem to get the same result using ElementTree in Python.

This is the KML file:

<?xml version="1.0" encoding="utf-8" ?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Schema name="" id="">
    <SimpleField name="NAME_0" type="string"></SimpleField>
    <SimpleField name="NAME_1" type="string"></SimpleField>
</Schema>
<Folder><name></name>
  <Placemark>
    <Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
    <ExtendedData><SchemaData schemaUrl="#gadm36_IND_3">
        <SimpleData name="NAME_0">val1</SimpleData>
        <SimpleData name="NAME_1">val2</SimpleData>
        <SimpleData name="NAME_2">val3</SimpleData>
    </SchemaData></ExtendedData>
      <MultiGeometry><Polygon><outerBoundaryIs><LinearRing><coordinates>92.7877807617189,9.24416637420654 92.7888870239258,9.24305438995361 92.7897186279296,9.24306106567383 92.7902832031251,9.24250030517589 92.7905578613282,9.24250030517589 92.7911148071289,9.24194431304943 92.7913894653321,9.24194431304943 92.7922210693359,9.24110984802257 92.7922210693359,9.24083423614508 92.7930526733399,9.23999977111822
      </coordinates></LinearRing>...
      <Placemark>
    <Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
    <ExtendedData><SchemaData schemaUrl="#gadm36_IND_3">
        <SimpleData name="NAME_0">val1</SimpleData>
        <SimpleData name="NAME_1">val2</SimpleData>
        <SimpleData name="NAME_2">val3</SimpleData>
    </SchemaData></ExtendedData>
      <MultiGeometry><Polygon><outerBoundaryIs><LinearRing><coordinates>92.7877807617189,9.24416637420654 92.7888870239258,9.24305438995361 92.7897186279296,9.24306106567383 92.7902832031251,9.24250030517589 92.7905578613282,9.24250030517589 92.7911148071289,9.24194431304943 92.7913894653321,9.24194431304943 92.7922210693359,9.24110984802257 92.7922210693359,9.24083423614508 92.7930526733399,9.23999977111822
      </coordinates></LinearRing>...

This is what I'm stuck on:

import xml.etree.ElementTree as ET
tree = ET.parse('')
root = tree.getroot()
for val in root.findall('.//{http://www.opengis.net/kml/2.2}SimpleData[@text=""]//coordinates'):
    print val.text

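Use lxml with XPath and namespaces. ElementTree supports only a limited subset of XPath: it has no ancestor axis, and [@text=""] tests an attribute named text rather than the element's text content. With lxml's full XPath 1.0 support, the selector can navigate from the SimpleData element whose text is val3 back to its Placemark ancestor, and from there down to the coordinates.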

from lxml import etree

tree = etree.parse("so.xml")
nsmap = {"kml": "http://www.opengis.net/kml/2.2"}

listOfCoordinates = tree.xpath("//kml:SimpleData[text()=\"val3\"]/ancestor::kml:Placemark//kml:coordinates", namespaces=nsmap)
print(listOfCoordinates[0].text)

Output:

92.7877807617189,9.24416637420654 92.7888870239258,9.24305438995361 92.7897186279296,9.24306106567383 92.7902832031251,9.24250030517589 92.7905578613282,9.24250030517589 92.7911148071289,9.24194431304943 92.7913894653321,9.24194431304943 92.7922210693359,9.24110984802257 92.7922210693359,9.24083423614508 92.7930526733399,9.23999977111822
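
If you would rather stay with the standard library, the same lookup can be approximated with ElementTree by looping over the Placemark elements and checking their SimpleData children in Python, since ElementTree cannot walk back up to an ancestor. This is only a sketch: the function name coordinates_for and the cut-down SAMPLE_KML string (with shortened, made-up coordinate values) are illustrative assumptions, not part of the original file.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map so the paths below can use "kml:" instead of the full URI.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Hypothetical, cut-down sample in the shape of the question's file.
# The coordinate values are shortened placeholders, not real data.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Folder>
  <Placemark>
    <ExtendedData><SchemaData schemaUrl="#gadm36_IND_3">
      <SimpleData name="NAME_0">val1</SimpleData>
      <SimpleData name="NAME_2">val3</SimpleData>
    </SchemaData></ExtendedData>
    <MultiGeometry><Polygon><outerBoundaryIs><LinearRing>
      <coordinates>92.78,9.24 92.79,9.25</coordinates>
    </LinearRing></outerBoundaryIs></Polygon></MultiGeometry>
  </Placemark>
  <Placemark>
    <ExtendedData><SchemaData schemaUrl="#gadm36_IND_3">
      <SimpleData name="NAME_2">val9</SimpleData>
    </SchemaData></ExtendedData>
    <MultiGeometry><Polygon><outerBoundaryIs><LinearRing>
      <coordinates>10.0,20.0 11.0,21.0</coordinates>
    </LinearRing></outerBoundaryIs></Polygon></MultiGeometry>
  </Placemark>
</Folder>
</Document>
</kml>"""


def coordinates_for(root, value):
    """Return the coordinates text of each Placemark that has a SimpleData equal to value."""
    found = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        # Collect the text of every SimpleData below this Placemark.
        texts = [sd.text for sd in placemark.iterfind(".//kml:SimpleData", KML_NS)]
        if value in texts:
            # A Placemark may hold several polygons, so keep every match.
            for coords in placemark.iterfind(".//kml:coordinates", KML_NS):
                found.append((coords.text or "").strip())
    return found


root = ET.fromstring(SAMPLE_KML)
print(coordinates_for(root, "val3"))
```

For a file on disk, replace ET.fromstring(SAMPLE_KML) with ET.parse(path).getroot(). The match here is exact equality on the element text; use a substring test instead if you need the behaviour of contains().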


 