
Parsing XML from URL and getting info from tag

I am trying to read the "CRS" tag from an XML document using Python. For now I want to collect all the CRS tags; later I may edit the code to check for a specific CRS.

Url: https://wms.geonorge.no/skwms1/wms.adm_enheter_historisk?service=WMS&request=GetCapabilities

I can get the data, but I can't figure out how to get the info from the correct tag.

This is my code so far:

import urllib.request, urllib.parse, urllib.error
import xml.etree.ElementTree as ET

url = 'https://wms.geonorge.no/skwms1/wms.adm_enheter_historisk?service=WMS&request=GetCapabilities'
uh = urllib.request.urlopen(url)
data = uh.read()

tree = ET.fromstring(data)

From here I'm not sure how to proceed with tree.find() or tree.findall().

Thanks.

So, here is what I did. I needed to check whether the XML contained the CRS EPSG:3857. Instead of reading everything in the CRS tags, I worked around the problem by testing whether the downloaded XML contained the text "EPSG:3857".

import urllib.request, urllib.parse, urllib.error
import xml.etree.ElementTree as ET

url = 'https://wms.geonorge.no/skwms1/wms.adm_enheter_historisk?service=WMS&request=GetCapabilities'
uh = urllib.request.urlopen(url)
data = uh.read()

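projection = "EPSG:3857"

# data is bytes, so compare against the encoded search string
# instead of relying on the repr produced by str(data)
if projection.encode() in data:
    print("Contains")
else:
    print("Does not contain")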

I am now implementing this in another program to search through multiple XML files.
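
For anyone who wants the CRS values themselves rather than a text search, here is a minimal sketch using only xml.etree.ElementTree. A plain tree.findall('CRS') tends to return nothing here for two reasons: a bare tag name only matches direct children of the root, and WMS 1.3.0 capabilities documents normally declare a default namespace, so the real tag is something like {http://www.opengis.net/wms}CRS. The sketch sidesteps both by walking every element and comparing the local part of the tag. The helper names local_name and collect_crs and the SAMPLE document are made up for illustration; SAMPLE is a hand-written stand-in, not the actual server response.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    # "{http://www.opengis.net/wms}CRS" -> "CRS"; un-namespaced tags pass through
    return tag.rsplit('}', 1)[-1]


def collect_crs(xml_bytes):
    """Return the sorted, de-duplicated text of every CRS element."""
    root = ET.fromstring(xml_bytes)
    found = set()
    # root.iter() visits the root and all descendants, at any depth
    for el in root.iter():
        if local_name(el.tag) == 'CRS' and el.text:
            found.add(el.text.strip())
    return sorted(found)


# Hand-written stand-in for a GetCapabilities response (assumption, for testing offline)
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Capability>
    <Layer>
      <CRS>EPSG:4326</CRS>
      <CRS>EPSG:3857</CRS>
      <Layer><CRS>EPSG:4326</CRS></Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>"""

crs_list = collect_crs(SAMPLE)
print(crs_list)
print("EPSG:3857" in crs_list)
```

To use it on the real service, pass the bytes returned by urllib.request.urlopen(url).read() to collect_crs. Checking membership in the returned list matches whole CRS values, whereas the raw text search would also match "EPSG:3857" inside a longer string.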

Try this.

from simplified_scrapy import req, SimplifiedDoc
xml = req.get(
    'https://wms.geonorge.no/skwms1/wms.adm_enheter_historisk?service=WMS&request=GetCapabilities'
)
doc = SimplifiedDoc(xml)
listCRS = doc.selects('CRS')
print(listCRS)
