Parsing XML Attribute Names

I'm using Python, currently 2.7, but I can easily change to the latest version.

I need to parse this XML and return the int value contained within an item. This isn't my XML; it comes from a piece of enterprise-level software.

<counters>
<item name="stats/counters/session/responsetime" type="int">1047</item>
<item name="stats/counters/session/responsecount" type="int">7423</item>
<item name="stats/counters/init/inittime" type="int">36339</item>
<item name="stats/counters/init/fetchtime" type="int">8097</item>
<item name="stats/connectionsetups" type="int">579</item>
<item name="stats/activesessions" type="int">4294967289</item>
<item name="stats/activeconnections" type="int">0</item>
</counters>

Code:

import requests
import xml.etree.ElementTree as ET

def _getstats():
    # urlStats is not defined in the snippet as posted.
    resp = requests.get(urlStats)

    # Writing XML to disk. This makes parsing it MUCH easier.
    # The with block closes the file, so no explicit close() is needed.
    with open('stats_10.xml', 'wb') as f:
        f.write(resp.content)

tree = ET.parse('stats_10.xml')
root = tree.getroot()

active = root.find('stats/activesessions')

print(active)

The return is always None. I'm using ElementTree. I've read through the documentation ( https://docs.python.org/3.0/library/xml.etree.elementtree.html ) and many Stack Overflow pages.

I think the problem is that the parser doesn't understand the slash.

I attempted to pull by name using "active = int(root['stats/activesessions'])" in place of root.find, which raises this error:

TypeError: list indices must be integers, not str

I also tried xmltodict, but that was even worse than using ElementTree. The error would always be 'list indices must be integers'.

Lastly, this is a dynamic XML document. Indexing by row is not an option: at idle the software returns, for example, 10 rows, and under load it returns 15, with the additional rows mixed in among the others. I have to pull by child name.

Thank you in advance for any assistance!

ADDITION:

I can iterate through the XML and pull the value by position. However, as stated above, the XML will change and the number of rows will increase, throwing my indices off.

active = root[5].text
print active

I believe the find method is looking for a tag name, not an attribute value. You need to find the item tag, check whether it has a name attribute, and then check whether that attribute equals "stats/activesessions". If it does, you can read the value of the item tag.
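The steps described in this answer could be sketched roughly as follows. This is only an illustration, not the answerer's code: the helper name read_counter and the trimmed SAMPLE string are made up for the example, and the XML is parsed from a string rather than from the stats_10.xml file. It also shows why the original call returns None: find() reads the slash as a separator between tag names, so it goes looking for an activesessions tag nested inside a stats tag.

```python
import xml.etree.ElementTree as ET

# Three items copied from the XML in the question (trimmed for brevity).
SAMPLE = """<counters>
<item name="stats/connectionsetups" type="int">579</item>
<item name="stats/activesessions" type="int">4294967289</item>
<item name="stats/activeconnections" type="int">0</item>
</counters>"""


def read_counter(root, wanted):
    """Hypothetical helper: return the int value of the <item> whose
    name attribute equals `wanted`, or None if there is no such item."""
    for item in root.iter("item"):
        # Element.get() returns None when the attribute is absent,
        # so items without a name attribute are simply skipped.
        if item.get("name") == wanted:
            return int(item.text)
    return None


root = ET.fromstring(SAMPLE)

# The path below is interpreted as "child tag <stats>, then child tag
# <activesessions>". No such tags exist, so nothing matches.
missing = root.find("stats/activesessions")

active = read_counter(root, "stats/activesessions")
print(missing, active)
```

Because the lookup is by attribute value, it does not depend on where the item sits in the document.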

This is obviously me not understanding XML and how it's structured. I added this to my code and I get the return value I'm looking for.

for item in root.findall("./item[@name='system/starttime']"):
    starttime = int(item.text)
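Since the document is dynamic and several counters may be needed, one possible variation on the attribute-predicate approach is to fetch a single item with find(), or to load every int counter into a dict keyed by name. This is a sketch under assumptions: the SAMPLE string is a trimmed copy of the XML in the question, and the names counters and node are made up for the example.

```python
import xml.etree.ElementTree as ET

# Three items copied from the XML in the question (trimmed for brevity).
SAMPLE = """<counters>
<item name="stats/counters/session/responsetime" type="int">1047</item>
<item name="stats/activesessions" type="int">4294967289</item>
<item name="stats/activeconnections" type="int">0</item>
</counters>"""

root = ET.fromstring(SAMPLE)

# Single lookup: find() returns the first match or None, so guard
# against the item being absent from a given response.
node = root.find("./item[@name='stats/activesessions']")
active = int(node.text) if node is not None else None

# Bulk lookup: map every int-typed item name to its value, so later
# code can index by name regardless of row order or row count.
counters = {
    item.get("name"): int(item.text)
    for item in root.findall("item")
    if item.get("type") == "int"
}

print(active)
print(counters.get("stats/activeconnections"))
```

Using counters.get(name) rather than counters[name] returns None for a counter that only appears under load, instead of raising KeyError.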
