
How to parse XML in Python?

I have to extract friendlyName from the XML document.

Here's my current solution:

root = ElementTree.fromstring(urllib2.urlopen(XMLLocation).read())        
for child in root.iter('{urn:schemas-upnp-org:device-1-0}friendlyName'):
    return child.text

Is there a better way to do this, ideally one that does not involve iteration? Could I use XPath?


XML content:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="urn:schemas-upnp-org:device-1-0">
   <specVersion>
      <major>1</major>
      <minor>0</minor>
   </specVersion>
   <device>
      <dlna:X_DLNADOC xmlns:dlna="urn:schemas-dlna-org:device-1-0">DMR-1.50</dlna:X_DLNADOC>
      <deviceType>urn:schemas-upnp-org:device:MediaRenderer:1</deviceType>
      <friendlyName>My Product 912496</friendlyName>
      <manufacturer>embedded</manufacturer>
      <manufacturerURL>http://www.embedded.com</manufacturerURL>
      <modelDescription>Product</modelDescription>
      <modelName>Product</modelName>
      <modelNumber />
      <modelURL>http://www.embedded.com</modelURL>
      <UDN>uuid:93b2abac-cb6a-4857-b891-002261912496</UDN>
      <serviceList>
         <service>
            <serviceType>urn:schemas-upnp-org:service:ConnectionManager:1</serviceType>
            <serviceId>urn:upnp-org:serviceId:ConnectionManager</serviceId>
            <SCPDURL>/xml/ConnectionManager.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelSinkConnectionManager</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelSinkConnectionManager</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-upnp-org:service:AVTransport:1</serviceType>
            <serviceId>urn:upnp-org:serviceId:AVTransport</serviceId>
            <SCPDURL>/xml/AVTransport2.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelAVTransport</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelAVTransport</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-upnp-org:service:RenderingControl:3</serviceType>
            <serviceId>urn:upnp-org:serviceId:RenderingControl</serviceId>
            <SCPDURL>/xml/RenderingControl2.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelRenderingControl</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelRenderingControl</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-embedded-com:service:RTSPGateway:1</serviceType>
            <serviceId>urn:embedded-com:serviceId:RTSPGateway</serviceId>
            <SCPDURL>/xml/RTSPGateway.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelRTSPGateway</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelRTSPGateway</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-embedded-com:service:SpeakerManagement:1</serviceType>
            <serviceId>urn:embedded-com:serviceId:SpeakerManagement</serviceId>
            <SCPDURL>/xml/SpeakerManagement.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelSpeakerManagement</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelSpeakerManagement</controlURL>
         </service>
         <service>
            <serviceType>urn:schemas-embedded-com:service:NetworkManagement:1</serviceType>
            <serviceId>urn:embedded-com:serviceId:NetworkManagement</serviceId>
            <SCPDURL>/xml/NetworkManagement.xml</SCPDURL>
            <eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelNetworkManagement</eventSubURL>
            <controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelNetworkManagement</controlURL>
         </service>
      </serviceList>
      <iconList>
         <icon>
            <mimetype>image/png</mimetype>
            <width>120</width>
            <height>120</height>
            <depth>32</depth>
            <url>/org.mpris.MediaPlayer2.mansion-120x120x32.png</url>
         </icon>
         <icon>
            <mimetype>image/png</mimetype>
            <width>48</width>
            <height>48</height>
            <depth>32</depth>
            <url>/org.mpris.MediaPlayer2.mansion-48x48x32.png</url>
         </icon>
         <icon>
            <mimetype>image/jpeg</mimetype>
            <width>120</width>
            <height>120</height>
            <depth>24</depth>
            <url>/org.mpris.MediaPlayer2.mansion-120x120x24.jpg</url>
         </icon>
         <icon>
            <mimetype>image/jpeg</mimetype>
            <width>48</width>
            <height>48</height>
            <depth>24</depth>
            <url>/org.mpris.MediaPlayer2.mansion-48x48x24.jpg</url>
         </icon>
      </iconList>
      <X_embeddedDevice xmlns:edd="schemas-embedded-com:extended-device-description">
         <firmwareVersion>v1.0 (4.155.1.15.002)</firmwareVersion>
         <features>
            <feature>
               <name>com.sony.Product</name>
               <version>1.0.0</version>
            </feature>
            <feature>
               <name>com.sony.Product.btmrc</name>
               <version>1.0.0</version>
            </feature>
            <feature>
               <name>com.sony.Product.btmrs</name>
               <version>1.0.0</version>
            </feature>
         </features>
      </X_embeddedDevice>
   </device>
</root>

Pedro, in the comments, is right.

.find(match, namespaces=None)

Finds the first subelement matching match. match may be a tag name or a path. Returns an element instance or None. namespaces is an optional mapping from namespace prefix to full name.
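As a minimal sketch of how the namespaces mapping is used, the snippet below runs find() against a cut-down version of the device description from the question. The upnp prefix and the SAMPLE and NS names are chosen here for illustration; only the namespace URI comes from the original document.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the question's XML (illustrative sample).
SAMPLE = """<root xmlns="urn:schemas-upnp-org:device-1-0">
   <device>
      <friendlyName>My Product 912496</friendlyName>
   </device>
</root>"""

# The prefix name is arbitrary; it only has to match the one used in the path.
NS = {"upnp": "urn:schemas-upnp-org:device-1-0"}

root = ET.fromstring(SAMPLE)

# Each prefixed step in the path is resolved through the mapping.
name = root.find("upnp:device/upnp:friendlyName", NS)
print(name.text)

# Without the namespace, the same path matches nothing and find() returns None.
missing = root.find("device/friendlyName")
print(missing)
```

Checking the result for None before reading .text avoids an AttributeError when the element is absent.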

The ElementTree docs are really helpful in these cases: https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.find

Edit: The link I gave in the comments leads to the following code:

import xml.etree.ElementTree as ET

# Sample document (renamed from `input` to avoid shadowing the built-in)
data = '''<stuff>
<users>
<user x="2">
<id>001</id>
<name>Chuck</name>
</user>
<user x="7">
<id>009</id>
<name>Brent</name>
</user>
</users>
</stuff>
'''
stuff = ET.fromstring(data)
lst = stuff.findall("users/user")
print(len(lst))
for item in lst:
    print(item.attrib["x"])
item = lst[0]
ET.dump(item)
item.get("x")   # get works on attributes
item.find("id").text
item.find("id").tag
# iter() replaces getiterator(), which was removed in Python 3.9
for user in stuff.iter('user'):
    print("User", user.attrib["x"])
    ET.dump(user)

The code above uses:

item.find("id").text

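If you adapt that call and remove the code you don't need, the lookup looks something like the following. Because the device description declares a default namespace, the path has to be namespace-qualified (here root is the parsed document and upnp is an arbitrary prefix):

root.find('upnp:device/upnp:friendlyName', {'upnp': 'urn:schemas-upnp-org:device-1-0'}).text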

Instead of using an input string, you can read the XML from a file with the following (from the ElementTree docs):

import xml.etree.ElementTree as ET
tree = ET.parse('your_file_name.xml')
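To show what comes back from ET.parse(), here is a small sketch that writes a sample document to a temporary file and reads it back. The file contents and the path variable are made up for the example; getroot() and find() are the standard ElementTree calls.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Write an illustrative document to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
    f.write("<stuff><users><user x='2'><id>001</id></user></users></stuff>")
    path = f.name

try:
    tree = ET.parse(path)   # returns an ElementTree, not an Element
    root = tree.getroot()   # the <stuff> element
    first_id = root.find("users/user/id").text
    print(root.tag, first_id)
finally:
    os.remove(path)         # clean up the temporary file
```

ET.parse() also accepts an open file object, so the same code works with a response stream as long as it is opened in binary mode.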

Using ElementTree, you can either read the XML directly from a file or parse it from a string.

First, include the following import. ElementTree reports malformed XML as ParseError.

from xml.etree.ElementTree import ElementTree, ParseError

If you are using a string:

from xml.etree.ElementTree import fromstring
try:
    # fromstring() returns the root Element; wrap it so tree.getroot() works below
    tree = ElementTree(fromstring(xml_data))
except ParseError:
    print("Unable to parse XML data from string")

Otherwise, to load it directly:

try:
    tree = ElementTree(file="filename")
except ParseError:
    print("Unable to parse XML from file")
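A short sketch of the failure case, assuming Python 3's standard library, where ElementTree reports malformed input as xml.etree.ElementTree.ParseError. The parse_or_none helper and the sample strings are hypothetical names used only here.

```python
import xml.etree.ElementTree as ET


def parse_or_none(xml_data):
    """Return the root element, or None if xml_data is not well-formed."""
    try:
        return ET.fromstring(xml_data)
    except ET.ParseError as err:
        # err.position holds the (line, column) where parsing failed.
        print("Unable to parse XML data from string:", err)
        return None


good = parse_or_none("<root><device/></root>")
bad = parse_or_none("<root><device></root>")  # mismatched closing tag
```

Returning None keeps the caller's control flow simple; re-raising may be the better choice when a parse failure should stop the program.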

Once you have the tree initialised, you can begin parsing the information.

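root = tree.getroot()
# The document declares a default namespace, so the path must be namespace-qualified.
ns = {'upnp': 'urn:schemas-upnp-org:device-1-0'}
print(root.find('upnp:device/upnp:friendlyName', ns).text)

import urllib.request  # Python 3; the question's urllib2 exists only in Python 2
import xml.etree.ElementTree as ElementTree

namespace = '{urn:schemas-upnp-org:device-1-0}'
root = ElementTree.fromstring(urllib.request.urlopen(XMLLocation).read())

# `.//` searches all descendants of the root, at any depth.
return root.find('.//{}friendlyName'.format(namespace)).text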

The find() function stops when it finds the first match. To get all of the elements that match the XPath, use the findall() function.
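The difference can be seen on a trimmed version of the service list from the question. This is only a sketch: the SAMPLE string keeps two of the original services, and the upnp prefix is an arbitrary label for the document's default namespace.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's service list (illustrative sample).
SAMPLE = """<root xmlns="urn:schemas-upnp-org:device-1-0">
   <device>
      <serviceList>
         <service>
            <serviceId>urn:upnp-org:serviceId:ConnectionManager</serviceId>
         </service>
         <service>
            <serviceId>urn:upnp-org:serviceId:AVTransport</serviceId>
         </service>
      </serviceList>
   </device>
</root>"""

NS = {"upnp": "urn:schemas-upnp-org:device-1-0"}
root = ET.fromstring(SAMPLE)

first = root.find(".//upnp:serviceId", NS)      # first match only, or None
every = root.findall(".//upnp:serviceId", NS)   # list of all matches, possibly empty

print(first.text)
print([e.text for e in every])
```

Note that findall() returns an empty list rather than None when nothing matches, so it can be looped over without a separate check.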
