How to parse XML in Python?
I have to extract friendlyName from the XML document.
Here's my current solution:
root = ElementTree.fromstring(urllib2.urlopen(XMLLocation).read())
for child in root.iter('{urn:schemas-upnp-org:device-1-0}friendlyName'):
    return child.text
Is there a better way to do this (maybe one that does not involve iteration)? Could I use XPath?
XML content:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="urn:schemas-upnp-org:device-1-0">
<specVersion>
<major>1</major>
<minor>0</minor>
</specVersion>
<device>
<dlna:X_DLNADOC xmlns:dlna="urn:schemas-dlna-org:device-1-0">DMR-1.50</dlna:X_DLNADOC>
<deviceType>urn:schemas-upnp-org:device:MediaRenderer:1</deviceType>
<friendlyName>My Product 912496</friendlyName>
<manufacturer>embedded</manufacturer>
<manufacturerURL>http://www.embedded.com</manufacturerURL>
<modelDescription>Product</modelDescription>
<modelName>Product</modelName>
<modelNumber />
<modelURL>http://www.embedded.com</modelURL>
<UDN>uuid:93b2abac-cb6a-4857-b891-002261912496</UDN>
<serviceList>
<service>
<serviceType>urn:schemas-upnp-org:service:ConnectionManager:1</serviceType>
<serviceId>urn:upnp-org:serviceId:ConnectionManager</serviceId>
<SCPDURL>/xml/ConnectionManager.xml</SCPDURL>
<eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelSinkConnectionManager</eventSubURL>
<controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelSinkConnectionManager</controlURL>
</service>
<service>
<serviceType>urn:schemas-upnp-org:service:AVTransport:1</serviceType>
<serviceId>urn:upnp-org:serviceId:AVTransport</serviceId>
<SCPDURL>/xml/AVTransport2.xml</SCPDURL>
<eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelAVTransport</eventSubURL>
<controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelAVTransport</controlURL>
</service>
<service>
<serviceType>urn:schemas-upnp-org:service:RenderingControl:3</serviceType>
<serviceId>urn:upnp-org:serviceId:RenderingControl</serviceId>
<SCPDURL>/xml/RenderingControl2.xml</SCPDURL>
<eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelRenderingControl</eventSubURL>
<controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelRenderingControl</controlURL>
</service>
<service>
<serviceType>urn:schemas-embedded-com:service:RTSPGateway:1</serviceType>
<serviceId>urn:embedded-com:serviceId:RTSPGateway</serviceId>
<SCPDURL>/xml/RTSPGateway.xml</SCPDURL>
<eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelRTSPGateway</eventSubURL>
<controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelRTSPGateway</controlURL>
</service>
<service>
<serviceType>urn:schemas-embedded-com:service:SpeakerManagement:1</serviceType>
<serviceId>urn:embedded-com:serviceId:SpeakerManagement</serviceId>
<SCPDURL>/xml/SpeakerManagement.xml</SCPDURL>
<eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelSpeakerManagement</eventSubURL>
<controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelSpeakerManagement</controlURL>
</service>
<service>
<serviceType>urn:schemas-embedded-com:service:NetworkManagement:1</serviceType>
<serviceId>urn:embedded-com:serviceId:NetworkManagement</serviceId>
<SCPDURL>/xml/NetworkManagement.xml</SCPDURL>
<eventSubURL>/Event/org.mpris.MediaPlayer2.mansion/RygelNetworkManagement</eventSubURL>
<controlURL>/Control/org.mpris.MediaPlayer2.mansion/RygelNetworkManagement</controlURL>
</service>
</serviceList>
<iconList>
<icon>
<mimetype>image/png</mimetype>
<width>120</width>
<height>120</height>
<depth>32</depth>
<url>/org.mpris.MediaPlayer2.mansion-120x120x32.png</url>
</icon>
<icon>
<mimetype>image/png</mimetype>
<width>48</width>
<height>48</height>
<depth>32</depth>
<url>/org.mpris.MediaPlayer2.mansion-48x48x32.png</url>
</icon>
<icon>
<mimetype>image/jpeg</mimetype>
<width>120</width>
<height>120</height>
<depth>24</depth>
<url>/org.mpris.MediaPlayer2.mansion-120x120x24.jpg</url>
</icon>
<icon>
<mimetype>image/jpeg</mimetype>
<width>48</width>
<height>48</height>
<depth>24</depth>
<url>/org.mpris.MediaPlayer2.mansion-48x48x24.jpg</url>
</icon>
</iconList>
<X_embeddedDevice xmlns:edd="schemas-embedded-com:extended-device-description">
<firmwareVersion>v1.0 (4.155.1.15.002)</firmwareVersion>
<features>
<feature>
<name>com.sony.Product</name>
<version>1.0.0</version>
</feature>
<feature>
<name>com.sony.Product.btmrc</name>
<version>1.0.0</version>
</feature>
<feature>
<name>com.sony.Product.btmrs</name>
<version>1.0.0</version>
</feature>
</features>
</X_embeddedDevice>
</device>
</root>
Pedro is right in the comments.
.find(match, namespaces=None)
Finds the first subelement matching match. match may be a tag name or a path. Returns an element instance or None. namespaces is an optional mapping from namespace prefix to full name.
The ElementTree docs are really helpful in these cases. https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.find
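As a minimal sketch of the namespaces argument described above, the snippet below looks up friendlyName in a cut-down copy of the device description from the question. The prefix upnp and the variable names are arbitrary choices made for illustration; only the namespace URI has to match the document.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the device description from the question.
xml_data = '''<root xmlns="urn:schemas-upnp-org:device-1-0">
  <device>
    <friendlyName>My Product 912496</friendlyName>
  </device>
</root>'''

# 'upnp' is an arbitrary prefix chosen here; it maps to the document's
# default namespace URI.
ns = {'upnp': 'urn:schemas-upnp-org:device-1-0'}

root = ET.fromstring(xml_data)
element = root.find('upnp:device/upnp:friendlyName', ns)

# find() returns None when nothing matches, so guard before reading .text.
friendly_name = element.text if element is not None else None
print(friendly_name)
```

Checking for None before reading .text avoids an AttributeError when a device description lacks the element.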
Edit: The link I gave in the comments leads to the following code:
import xml.etree.ElementTree as ET

input = '''<stuff>
  <users>
    <user x="2">
      <id>001</id>
      <name>Chuck</name>
    </user>
    <user x="7">
      <id>009</id>
      <name>Brent</name>
    </user>
  </users>
</stuff>
'''
stuff = ET.fromstring(input)
lst = stuff.findall("users/user")
print(len(lst))
for item in lst:
    print(item.attrib["x"])
item = lst[0]
ET.dump(item)
print(item.get("x"))  # get works on attributes
print(item.find("id").text)
print(item.find("id").tag)
# iter() replaces getiterator(), which was removed in Python 3.9
for user in stuff.iter('user'):
    print("User " + user.attrib["x"])
    ET.dump(user)
The code above uses:
item.find("id").text
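If you adapt that and remove the code you don't need, the lookup becomes a single find() call on the parsed root element. Because the document declares a default namespace (urn:schemas-upnp-org:device-1-0), every tag in the path has to be namespace-qualified, for example through a prefix mapping (the prefix upnp is an arbitrary choice):
root.find('upnp:device/upnp:friendlyName', {'upnp': 'urn:schemas-upnp-org:device-1-0'}).text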
Instead of using the input string, you can read the XML from a file with the following (from the ElementTree docs):
import xml.etree.ElementTree as ET
tree = ET.parse('your_file_name.xml')
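To make the file-based variant concrete, here is a small sketch that writes a stand-in document to a temporary file, parses it with ET.parse(), and reads the name from the root element. The temporary file, the sample content, and the variable names are assumptions made only so the example is self-contained; in practice you would pass your own file name.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Namespace URI in the {uri}tag form that ElementTree uses for qualified names.
NS = '{urn:schemas-upnp-org:device-1-0}'

# Stand-in for the real device description (assumption for this sketch).
sample = ('<root xmlns="urn:schemas-upnp-org:device-1-0">'
          '<device><friendlyName>My Product 912496</friendlyName></device>'
          '</root>')

# Write the sample to a temporary file so there is something to parse.
with tempfile.NamedTemporaryFile('w', suffix='.xml', delete=False) as handle:
    handle.write(sample)
    path = handle.name

try:
    tree = ET.parse(path)   # returns an ElementTree object
    root = tree.getroot()   # the top-level <root> element
    name = root.find('{0}device/{0}friendlyName'.format(NS)).text
finally:
    os.remove(path)         # clean up the temporary file

print(name)
```

ET.parse() returns an ElementTree, so getroot() is needed before calling find() relative to the root element.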
Using ElementTree, you can either read directly from the file or load the XML from a string.
First, include the following imports.
from xml.etree.ElementTree import ElementTree, fromstring, ParseError
If you are using a string:
try:
    root = fromstring(xml_data)  # fromstring() returns the root element directly
except ParseError:
    print("Unable to parse XML data from string")
Otherwise, to load it directly from a file:
try:
    root = ElementTree(file="filename").getroot()
except ParseError:
    print("Unable to parse XML from file")
Once you have the root element, you can begin extracting the information. The document declares a default namespace, so the path has to be namespace-qualified (the prefix upnp is an arbitrary choice):
ns = {'upnp': 'urn:schemas-upnp-org:device-1-0'}
print(root.find('upnp:device/upnp:friendlyName', ns).text)
import urllib2
import xml.etree.ElementTree as ElementTree
namespace = '{urn:schemas-upnp-org:device-1-0}'
root = ElementTree.fromstring(urllib2.urlopen(XMLLocation).read())
# `.//` searches all descendants of the root, at any depth.
return root.find('.//{}friendlyName'.format(namespace)).text
The find() function stops when it finds the first match. To get all of the elements that match the XPath, use the findall() function.
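The difference between find() and findall() can be sketched on a cut-down service list. The sample document and variable names below are assumptions for illustration; the namespace URI and the service IDs come from the XML in the question.

```python
import xml.etree.ElementTree as ET

# Cut-down service list based on the device description in the question.
xml_data = '''<root xmlns="urn:schemas-upnp-org:device-1-0">
  <device>
    <serviceList>
      <service><serviceId>urn:upnp-org:serviceId:ConnectionManager</serviceId></service>
      <service><serviceId>urn:upnp-org:serviceId:AVTransport</serviceId></service>
    </serviceList>
  </device>
</root>'''

namespace = '{urn:schemas-upnp-org:device-1-0}'
root = ET.fromstring(xml_data)
path = './/{}serviceId'.format(namespace)

# find() returns only the first matching element (or None).
first_id = root.find(path).text

# findall() returns a list of every matching element, in document order.
all_ids = [element.text for element in root.findall(path)]

print(first_id)
print(all_ids)
```

For a single value such as friendlyName, find() is enough; findall() is the better fit for repeated elements such as the services.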