
How to parse XML in Python?

I need to extract friendlyName from an XML document.

This is my current solution:

root = ElementTree.fromstring(urllib2.urlopen(XMLLocation).read())        
for child in root.iter('{urn:schemas-upnp-org:device-1-0}friendlyName'):
    return child.text

Is there a better way to do this (perhaps one that does not involve iterating)? Can I use XPath?


XML content:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="urn:schemas-upnp-org:device-1-0">
   <specVersion>
      <major>1</major>
      <minor>0</minor>
   </specVersion>
   <device>
      <dlna:X_DLNADOC xmlns:dlna="urn:schemas-dlna-org:device-1-0">DMR-1.50</dlna:X_DLNADOC>
      <deviceType>urn:schemas-upnp-org:device:MediaRenderer:1</deviceType>
      <friendlyName>My Product 912496</friendlyName>
      <manufacturer>embedded</manufacturer>
      <manufacturerURL>http://www.embedded.com</manufacturerURL>
      <modelDescription>Product</modelDescription>
      <modelName>Product</modelName>
      <modelNumber />
      <modelURL>http://www.embedded.com</modelURL>
      <UDN>uuid:93b2abac-cb6a-4857-b891-002261912496</UDN>
      <serviceList>
         <service>
            <serviceType>urn:schemas-upnp-org:service:ConnectionManager:1</serviceType>
            <serviceId>urn:upnp-org:serviceId:ConnectionManager</serviceId>
            <SCPDURL>/xml/ConnectionManager.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelSinkConnectionManager</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelSinkConnectionManager</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-upnp-org:service:AVTransport:1</serviceType>
            <serviceId>urn:upnp-org:serviceId:AVTransport</serviceId>
            <SCPDURL>/xml/AVTransport2.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelAVTransport</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelAVTransport</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-upnp-org:service:RenderingControl:3</serviceType>
            <serviceId>urn:upnp-org:serviceId:RenderingControl</serviceId>
            <SCPDURL>/xml/RenderingControl2.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelRenderingControl</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelRenderingControl</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-embedded-com:service:RTSPGateway:1</serviceType>
            <serviceId>urn:embedded-com:serviceId:RTSPGateway</serviceId>
            <SCPDURL>/xml/RTSPGateway.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelRTSPGateway</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelRTSPGateway</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-embedded-com:service:SpeakerManagement:1</serviceType>
            <serviceId>urn:embedded-com:serviceId:SpeakerManagement</serviceId>
            <SCPDURL>/xml/SpeakerManagement.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelSpeakerManagement</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelSpeakerManagement</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-embedded-com:service:NetworkManagement:1</serviceType>
            <serviceId>urn:embedded-com:serviceId:NetworkManagement</serviceId>
            <SCPDURL>/xml/NetworkManagement.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelNetworkManagement</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelNetworkManagement</controlURL>
         </service>
      </serviceList>
      <iconList>
         <icon>
            <mimetype>image/png</mimetype>
            <width>120</width>
            <height>120</height>
            <depth>32</depth>
            <url>/org.mpris.MediaPlayer2.mansion-120x120x32.png</url>
         </icon>
         <icon>
            <mimetype>image/png</mimetype>
            <width>48</width>
            <height>48</height>
            <depth>32</depth>
            <url>/org.mpris.MediaPlayer2.mansion-48x48x32.png</url>
         </icon>
         <icon>
            <mimetype>image/jpeg</mimetype>
            <width>120</width>
            <height>120</height>
            <depth>24</depth>
            <url>/org.mpris.MediaPlayer2.mansion-120x120x24.jpg</url>
         </icon>
         <icon>
            <mimetype>image/jpeg</mimetype>
            <width>48</width>
            <height>48</height>
            <depth>24</depth>
            <url>/org.mpris.MediaPlayer2.mansion-48x48x24.jpg</url>
         </icon>
      </iconList>
      <X_embeddedDevice xmlns:edd="schemas-embedded-com:extended-device-description">
         <firmwareVersion>v1.0 (4.155.1.15.002)</firmwareVersion>
         <features>
            <feature>
               <name>com.sony.Product</name>
               <version>1.0.0</version>
            </feature>
            <feature>
               <name>com.sony.Product.btmrc</name>
               <version>1.0.0</version>
            </feature>
            <feature>
               <name>com.sony.Product.btmrs</name>
               <version>1.0.0</version>
            </feature>
         </features>
      </X_embeddedDevice>
   </device>
</root>

Pedro is right in his comment.

.find(match, namespaces=None)

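Finds the first subelement matching match. match may be a tag name or a path. Returns an element instance or None. namespaces is an optional mapping from namespace prefix to full name.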

The ElementTree documentation is really helpful in cases like this: https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.find
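As a minimal sketch of the namespaces argument described above, the snippet below looks up friendlyName in a trimmed copy of the device description from the question. The prefix upnp, the variable names, and the shortened XML are assumptions made for illustration; only the namespace URI comes from the original document.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML (illustrative only).
XML_DATA = """<root xmlns="urn:schemas-upnp-org:device-1-0">
   <device>
      <friendlyName>My Product 912496</friendlyName>
   </device>
</root>"""

# Map an arbitrary prefix (assumed here: "upnp") to the default namespace URI.
NS = {"upnp": "urn:schemas-upnp-org:device-1-0"}

root = ET.fromstring(XML_DATA)

# Every step of the path carries the prefix, because every element
# inherits the default namespace declared on <root>.
element = root.find("upnp:device/upnp:friendlyName", NS)
friendly_name = element.text if element is not None else None

# Without the namespace, the same path matches nothing.
unqualified = root.find("device/friendlyName")

print(friendly_name)
```

Checking for None before reading .text avoids an AttributeError when the element is missing.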

Edit: the link I gave in the comments leads to the following code:

import xml.etree.ElementTree as ET
input = '''<stuff>
<users>
<user x="2">
<id>001</id>
<name>Chuck</name>
</user>
<user x="7">
<id>009</id>
<name>Brent</name>
</user>
</users>
</stuff>
'''
stuff = ET.fromstring(input)
lst = stuff.findall("users/user")
print(len(lst))
for item in lst:
    print(item.attrib["x"])
item = lst[0]
ET.dump(item)
item.get("x")   # get works on attributes
item.find("id").text
item.find("id").tag
# iter() replaces getiterator(), which was removed in Python 3.9
for user in stuff.iter('user'):
    print("User", user.attrib["x"])
    ET.dump(user)

The code above uses:

item.find("id").text

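If you adapt this and remove the other code you do not need, the find call should look something like the following. Because the document in the question declares a default namespace, each tag in the path has to be namespace-qualified; the prefix upnp in the mapping is an arbitrary choice:

root.find('upnp:device/upnp:friendlyName', {'upnp': 'urn:schemas-upnp-org:device-1-0'}).text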

Instead of using an input string, you can read an XML file with the following (from the ElementTree documentation):

import xml.etree.ElementTree as ET
tree = ET.parse('your_file_name.xml')
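To illustrate the file-based route end to end, here is a rough sketch that writes a trimmed copy of the question's XML to a temporary file, parses it with ET.parse(), and reads friendlyName from the root element. The helper name friendly_name_from_file, the upnp prefix, and the temporary file are hypothetical and exist only to keep the example self-contained.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML (illustrative only).
XML_DATA = """<root xmlns="urn:schemas-upnp-org:device-1-0">
   <device>
      <friendlyName>My Product 912496</friendlyName>
   </device>
</root>"""

# Arbitrary prefix (assumed: "upnp") bound to the document's default namespace.
NS = {"upnp": "urn:schemas-upnp-org:device-1-0"}


def friendly_name_from_file(path):
    """Parse the XML file at `path` and return the friendlyName text, or None."""
    root = ET.parse(path).getroot()
    return root.findtext("upnp:device/upnp:friendlyName", namespaces=NS)


# Write the sample to a temporary file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(
    "w", suffix=".xml", delete=False, encoding="utf-8"
) as handle:
    handle.write(XML_DATA)
    path = handle.name

try:
    name = friendly_name_from_file(path)
finally:
    os.remove(path)

print(name)
```

findtext() returns None when nothing matches, so no separate None check is needed here.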

With ElementTree, you can either read directly from a file or load the XML from a string.

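First, include the following imports.

# ElementTree raises ParseError when the XML is malformed
from xml.etree.ElementTree import ElementTree, ParseError, fromstring

If you are using a string:

try:
    # fromstring() returns the root Element; wrap it so getroot() works below
    tree = ElementTree(fromstring(xml_data))
except ParseError:
    print("Unable to parse XML data from string")

Otherwise, load it directly:

try:
    tree = ElementTree(file="filename")
except ParseError:
    print("Unable to parse XML from file")

Once the tree is initialized, you can start extracting information.

# The document declares a default namespace, so every tag in the path must be qualified.
ns = {'upnp': 'urn:schemas-upnp-org:device-1-0'}  # the prefix name is arbitrary
root = tree.getroot()
print(root.find('upnp:device/upnp:friendlyName', ns).text)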

import xml.etree.ElementTree as ElementTree
from urllib.request import urlopen  # Python 3; on Python 2 this was urllib2.urlopen

namespace = '{urn:schemas-upnp-org:device-1-0}'
root = ElementTree.fromstring(urlopen(XMLLocation).read())

# `.//` selects matching subelements at any depth below the root.
return root.find('.//{}friendlyName'.format(namespace)).text

The find() function stops at the first match. To get all elements that match the XPath, use the findall() function.
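The difference between the two calls can be sketched on a trimmed copy of the question's serviceList. The upnp prefix, the variable names, and the shortened XML are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Two of the services from the question's XML (illustrative only).
XML_DATA = """<root xmlns="urn:schemas-upnp-org:device-1-0">
   <device>
      <serviceList>
         <service>
            <serviceType>urn:schemas-upnp-org:service:ConnectionManager:1</serviceType>
         </service>
         <service>
            <serviceType>urn:schemas-upnp-org:service:AVTransport:1</serviceType>
         </service>
      </serviceList>
   </device>
</root>"""

# Arbitrary prefix (assumed: "upnp") bound to the document's default namespace.
NS = {"upnp": "urn:schemas-upnp-org:device-1-0"}

root = ET.fromstring(XML_DATA)

# find() returns only the first matching element (or None).
first_type = root.find(".//upnp:serviceType", NS).text

# findall() returns a list of every matching element, in document order.
all_types = [element.text for element in root.findall(".//upnp:serviceType", NS)]

print(first_type)
print(all_types)
```

Since findall() always returns a list, an empty result is simply [] and can be iterated safely.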
