How to read kml file with specific identifier in python?

I am trying to read these KML files provided by the German weather service (DWD): example_data

With the following code I cannot access the dwd: child elements:

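from zipfile import ZipFile
# assumed: pykml's objectify-based parser, which matches the parser.fromstring call below
from pykml import parser
from urllib.request import urlretrieve

# save the download under the file name that is opened below
urlretrieve('http://opendata.dwd.de/weather/local_forecasts/mos/MOSMIX_L/single_stations/10641/kml/MOSMIX_L_LATEST_10641.kmz', 'local_data.kmz')

kmz = ZipFile("local_data.kmz", 'r')
kml = kmz.open(kmz.filelist[0].filename, 'r').read()

root = parser.fromstring(kml)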

With root.Document.Placemark.ExtendedData.getchildren() I can access the following list (its length is 114; I cut it here):

[<Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71705b2b08>,
 <Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706befc8>,
 <Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706bef88>,
 <Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706bebc8>,
 <Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706bea88>,
 <Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706beb08>,
 <Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706be848>,
 <Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706be988>]

But with root.Document.Placemark.ExtendedData.Forecast I get the following error message:

AttributeError: no such child: {http://www.opengis.net/kml/2.2}Forecast

I guess the problem is that the lookup uses the standard OpenGIS KML namespace instead of the dwd one. How can I access the data?

This is the head of a file:

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<kml:kml xmlns:dwd="https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:xal="urn:oasis:names:tc:ciq:xsdschema:xAL:2.0" xmlns:kml="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom">
    <kml:Document>
        <kml:ExtendedData>
            <dwd:ProductDefinition>
                <dwd:Issuer>Deutscher Wetterdienst</dwd:Issuer>
                <dwd:ProductID>DWD_MOSMIX_1H</dwd:ProductID>
                <dwd:GeneratingProcess>DWD MOSMIX hourly, Version 1.0</dwd:GeneratingProcess>
                <dwd:IssueTime></dwd:IssueTime>
                <dwd:ReferencedModel>
                    <dwd:Model dwd:name="ICON" dwd:referenceTime="2018-05-17T00:00:00Z"/>
                    <dwd:Model dwd:name="ECMWF/IFS" dwd:referenceTime="2018-05-17T00:00:00Z"/>
                </dwd:ReferencedModel>
                <dwd:ForecastTimeSteps>
                    <dwd:TimeStep>2018-05-17T10:00:00.000Z</dwd:TimeStep>
                    <dwd:TimeStep>2018-05-17T11:00:00.000Z</dwd:TimeStep>
                    <dwd:TimeStep>2018-05-17T12:00:00.000Z</dwd:TimeStep>
                    <dwd:TimeStep>2018-05-17T13:00:00.000Z</dwd:TimeStep>
                    <dwd:TimeStep>2018-05-17T14:00:00.000Z</dwd:TimeStep>
                    <dwd:TimeStep>2018-05-17T15:00:00.000Z</dwd:TimeStep>

To parse custom elements out of a KML/KMZ file, the BeautifulSoup Python library is probably the simplest option.

from zipfile import ZipFile
import requests
from bs4 import BeautifulSoup

url = 'http://opendata.dwd.de/weather/local_forecasts/mos/MOSMIX_L/single_stations/10641/kml/MOSMIX_L_LATEST_10641.kmz'
r = requests.get(url)
with open("local_data.kmz", "wb") as fout:
    fout.write(r.content)

with ZipFile("local_data.kmz", 'r') as kmz:
    kml = kmz.open(kmz.filelist[0].filename, 'r').read()
soup = BeautifulSoup(kml, 'xml')

# iterate over each TimeStep element in dwd:ForecastTimeSteps
steps = soup.find("dwd:ForecastTimeSteps")
for step in steps.find_all("dwd:TimeStep"):
    print(step.text)

Output:

2021-10-20T16:00:00.000Z
2021-10-20T17:00:00.000Z
2021-10-20T18:00:00.000Z
...
2021-10-30T20:00:00.000Z
2021-10-30T21:00:00.000Z
2021-10-30T22:00:00.000Z
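
As a side note on why the attribute lookup in the question fails: the error message shows that the lookup searched for Forecast in the KML namespace, while the element lives in the dwd namespace. A namespace-aware search avoids this. Below is a minimal sketch using only the standard library's xml.etree.ElementTree; SAMPLE_KML is a shortened stand-in built from the file head shown in the question, and the helper names time_steps and model_names are made up for illustration. lxml's find/findall accept the same namespaces mapping, so the idea should carry over.

```python
import xml.etree.ElementTree as ET

# Prefix -> URI mapping, copied from the xmlns declarations in the file head.
NS = {
    "kml": "http://www.opengis.net/kml/2.2",
    "dwd": "https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd",
}

# Shortened stand-in for the real file so the sketch runs offline (assumption).
SAMPLE_KML = """
<kml:kml xmlns:dwd="https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd"
         xmlns:kml="http://www.opengis.net/kml/2.2">
  <kml:Document>
    <kml:ExtendedData>
      <dwd:ProductDefinition>
        <dwd:ReferencedModel>
          <dwd:Model dwd:name="ICON" dwd:referenceTime="2018-05-17T00:00:00Z"/>
          <dwd:Model dwd:name="ECMWF/IFS" dwd:referenceTime="2018-05-17T00:00:00Z"/>
        </dwd:ReferencedModel>
        <dwd:ForecastTimeSteps>
          <dwd:TimeStep>2018-05-17T10:00:00.000Z</dwd:TimeStep>
          <dwd:TimeStep>2018-05-17T11:00:00.000Z</dwd:TimeStep>
        </dwd:ForecastTimeSteps>
      </dwd:ProductDefinition>
    </kml:ExtendedData>
  </kml:Document>
</kml:kml>
"""


def time_steps(kml_text):
    """Return the text of every dwd:TimeStep element, in document order."""
    root = ET.fromstring(kml_text)
    return [el.text for el in root.iterfind(".//dwd:TimeStep", NS)]


def model_names(kml_text):
    """Return the dwd:name attribute of every dwd:Model element."""
    root = ET.fromstring(kml_text)
    # Namespaced attributes are keyed as "{uri}localname".
    name_attr = "{%s}name" % NS["dwd"]
    return [model.get(name_attr) for model in root.iterfind(".//dwd:Model", NS)]


if __name__ == "__main__":
    print(time_steps(SAMPLE_KML))
    print(model_names(SAMPLE_KML))
```

For the real file, the bytes read from the KMZ archive (the kml variable above) can be passed to the same functions in place of SAMPLE_KML.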
