LXML: how to get multiple sets of attributes into lists
I have a similar problem.
My XML data looks like this:
<?xml version="1.0" encoding="utf-8"?>
<Basic>
<Segment>
<Sample value="12" data2="25" data3="23"/>
<Sample value="13" data2="0" data3="323"/>
<Sample value="14" data2="2" data3="3"/>
</Segment>
</Basic>
What is the simplest Python way to collect these dataX attribute values into lists?
For example: data2 = ['25','0','2']
Using XPath:
from lxml import etree
from collections import defaultdict
from pprint import pprint
doc="""<?xml version="1.0" encoding="utf-8"?>
<Basic>
<Segment>
<Sample value="12" data2="25" data3="23"/>
<Sample value="13" data2="0" data3="323"/>
<Sample value="14" data2="2" data3="3"/>
</Segment>
</Basic>
"""
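# Encode to bytes: on Python 3, lxml rejects str input that carries an encoding declaration
el = etree.fromstring(doc.encode("utf-8"))
# All data2 attribute values, in document order
data2 = el.xpath('//@data2')
# All attributes whose name starts with "data"
dataX = el.xpath('//@*[starts-with(name(), "data")]')
print(data2)
print(dataX)
# Iterate over Sample elements, as in J.F. Sebastian's answer, but select them with XPath
d = defaultdict(list)
for sample in el.xpath('//Sample'):
    for attr_name, attr_value in sample.items():
        d[attr_name].append(attr_value)
pprint(dict(d))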
Output:
['25', '0', '2']
['25', '23', '0', '323', '2', '3']
{'data2': ['25', '0', '2'],
'data3': ['23', '323', '3'],
'value': ['12', '13', '14']}
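The dataX list above mixes data2 and data3 values in document order. A minimal sketch of one way to split them by attribute name, assuming lxml's default XPath "smart string" results, which expose the attribute name as attrname. The names root and grouped are illustrative and not part of the original answer:

```python
from collections import defaultdict
from lxml import etree

doc = b"""<?xml version="1.0" encoding="utf-8"?>
<Basic>
<Segment>
<Sample value="12" data2="25" data3="23"/>
<Sample value="13" data2="0" data3="323"/>
<Sample value="14" data2="2" data3="3"/>
</Segment>
</Basic>
"""

root = etree.fromstring(doc)
grouped = defaultdict(list)
# Each attribute result is a str subclass that remembers which attribute it came from
for value in root.xpath('//Sample/@*[starts-with(name(), "data")]'):
    grouped[value.attrname].append(str(value))

print(dict(grouped))
```

This keeps the single XPath query but yields one list per dataX attribute, without touching the value attribute.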
The simplest way to get an attribute value is the element's get('attr_name') method:
from lxml import etree
s = '''<?xml version="1.0" encoding="utf-8"?>
<Basic>
<Segment>
<Sample value="12" data2="25" data3="23"/>
<Sample value="13" data2="0" data3="323"/>
<Sample value="14" data2="2" data3="3"/>
</Segment>
</Basic>'''
# Python 2: a str can be passed directly
# tree = etree.fromstring(s)
# Python 3: encode to bytes, because the string carries an encoding declaration
tree = etree.fromstring(s.encode("utf-8"))
samples = tree.xpath('//Sample')
print([sample.get('data2') for sample in samples])
# Output: ['25', '0', '2']
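When several attributes are needed, the same get() call can be wrapped in a small helper. This is a sketch only: it uses the stdlib xml.etree.ElementTree rather than lxml (an assumption, so that it runs without extra packages), and attribute_lists is a hypothetical helper name that does not appear in the original answer:

```python
import xml.etree.ElementTree as ET

XML = """<Basic>
<Segment>
<Sample value="12" data2="25" data3="23"/>
<Sample value="13" data2="0" data3="323"/>
<Sample value="14" data2="2" data3="3"/>
</Segment>
</Basic>"""


def attribute_lists(root, tag, names):
    """Return {name: [values...]} for every element with the given tag."""
    elements = list(root.iter(tag))
    # get() returns None when an element lacks the attribute
    return {name: [el.get(name) for el in elements] for name in names}


root = ET.fromstring(XML)
result = attribute_lists(root, "Sample", ["data2", "data3"])
print(result)
```

Because get() returns None for a missing attribute, every list has one entry per Sample element, which keeps the lists aligned by position.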
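Using ElementTree from the stdlib (on Python 2, import cElementTree instead; it was removed in Python 3.9):
import sys
from collections import defaultdict
from pprint import pprint
from xml.etree import ElementTree as etree
d = defaultdict(list)
# Parse incrementally; by default iterparse() yields each element at its "end" event
for ev, el in etree.iterparse(sys.stdin):
    if el.tag == 'Sample':
        for name in "value data2 data3".split():
            d[name].append(el.get(name))
pprint(dict(d))
Output: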
{'data2': ['25', '0', '2'],
'data3': ['23', '323', '3'],
'value': ['12', '13', '14']}
If you use lxml.etree, you can pass tag='Sample' to iterparse(), i.e. etree.iterparse(file, tag='Sample'), so that only Sample elements are selected. In that case the if el.tag == 'Sample' condition can be removed.
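A minimal sketch of that lxml variant. The in-memory io.BytesIO object stands in for a real file and is an assumption made for illustration, as are the variable names:

```python
import io
from collections import defaultdict
from lxml import etree

doc = b"""<?xml version="1.0" encoding="utf-8"?>
<Basic>
<Segment>
<Sample value="12" data2="25" data3="23"/>
<Sample value="13" data2="0" data3="323"/>
<Sample value="14" data2="2" data3="3"/>
</Segment>
</Basic>
"""

d = defaultdict(list)
# tag='Sample' makes lxml yield only Sample elements, so no tag check is needed
for event, el in etree.iterparse(io.BytesIO(doc), tag='Sample'):
    for name, value in el.items():
        d[name].append(value)

print(dict(d))
```

Iterating over el.items() also avoids hard-coding the attribute names, at the cost of collecting every attribute present on each Sample element.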