How to parse tree using lxml iterparse
This is part of my XML, starting partway through the file:
<bigchapter>
<chapter id="a" name="x">
<valueimportant v="valuetoget1"/>
<TimeSeries>
<TimeSeriesIdentification v="1"/>
<type v="a1"/>
<Period>
<Interval>
<Pos v="1"/>
<Qty v="26"/>
</Interval>
<Interval>
<Pos v="2"/>
<Qty v="26"/>
</Interval>
</Period>
</TimeSeries>
<TimeSeries>
<type v="b1"/>
<Period>
<Interval>
<Pos v="1"/>
<Qty v="26"/>
</Interval>
<Interval>
<Pos v="2"/>
<Qty v="26"/>
</Interval>
</Period>
</TimeSeries>
</chapter>
<chapter id="a" name="x">
<valueimportant v="valuetoget2"/>
<TimeSeries>
<TimeSeriesIdentification v="1"/>
<type v="a1"/>
<Period>
<Interval>
<Pos v="1"/>
<Qty v="154"/>
</Interval>
<Interval>
<Pos v="2"/>
<Qty v="126"/>
</Interval>
</Period>
</TimeSeries>
<TimeSeries>
<type v="b1"/>
<Period>
<Interval>
<Pos v="1"/>
<Qty v="137"/>
</Interval>
<Interval>
<Pos v="2"/>
<Qty v="148"/>
</Interval>
</Period>
</TimeSeries>
</chapter>
</bigchapter>
What I want is to create a dictionary with each valueimportant value as a key. Its value is another dictionary keyed by type, and each type maps to a dictionary with Pos as the key and Qty as the value.
The result should look like this:
{valuetoget1: {a1: {1: 26, 2: 26}, b1: {1: 26, 2: 26}}, valuetoget2: {a1: {1: 154, 2: 126}, b1: {1: 137, 2: 148}}}
There is also some XML before this part, which is irrelevant. I tried the approach below and I get the first part of my dictionary (the keys), but I do not know how to proceed. I would prefer to use lxml etree.
from lxml import etree

result = {}
# file_obj is the already opened XML file
context = etree.iterparse(file_obj, events=("end",))
for event, elem in context:
    try:
        if elem.tag == 'chapter':
            valueimportant = elem.find('valueimportant')
            if valueimportant.attrib['v'] not in result:
                result[valueimportant.attrib['v']] = {}
    # A tuple is needed here: "IndexError or KeyError or ValueError" catches only IndexError
    except (IndexError, KeyError, ValueError):
        print('error')
I'm not sure that you need lxml there; the built-in ElementTree functionality should be enough. The main task is to collect data, so just iterate over the root node, processing each <chapter> separately: find the <valueimportant> node with a v attribute, then iterate over the <Period> node and find the <Pos> and <Qty> nodes with v attributes.
Code:
import xml.etree.ElementTree as ET

xml = ET.parse("file.xml")
root = xml.getroot()
result = {}
for chapter in root:  # root.iterfind(".//chapter")
    valueimportant = chapter.find("./valueimportant[@v]")
    if valueimportant is not None:
        period = chapter.find("./TimeSeries/Period")  # chapter.find(".//Period")
        if period is not None:
            values = {}
            for interval in period:  # period.iterfind(".//Interval")
                pos = interval.find("./Pos[@v]")
                qty = interval.find("./Qty[@v]")
                if pos is not None and qty is not None:
                    values[pos.attrib["v"]] = qty.attrib["v"]
            result[valueimportant.attrib["v"]] = values
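Note that the code above keeps only the first <Period> of each chapter and does not group the values by <type>, while the question asks for one inner dictionary per type. Below is a minimal sketch of how that could be done with the standard library's iterparse, so the file is still read as a stream. The helper names chapter_to_entry and parse_chapters are made up for illustration, the inline sample is a shortened copy of the XML from the question, and converting Pos and Qty to int is an assumption based on the sample values.

```python
import io
import xml.etree.ElementTree as ET


def chapter_to_entry(chapter):
    """Return (key, {type: {pos: qty}}) for one <chapter>, or None if it has no key."""
    key_node = chapter.find("./valueimportant[@v]")
    if key_node is None:
        return None
    by_type = {}
    for series in chapter.iterfind("./TimeSeries"):
        type_node = series.find("./type[@v]")
        if type_node is None:
            continue  # skip series without a type
        values = {}
        for interval in series.iterfind("./Period/Interval"):
            pos = interval.find("./Pos[@v]")
            qty = interval.find("./Qty[@v]")
            if pos is not None and qty is not None:
                # Assumption: both attributes hold integers, as in the sample.
                values[int(pos.get("v"))] = int(qty.get("v"))
        by_type[type_node.get("v")] = values
    return key_node.get("v"), by_type


def parse_chapters(source):
    """Stream the document and collect one entry per <chapter>."""
    result = {}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "chapter":
            continue
        entry = chapter_to_entry(elem)
        if entry is not None:
            key, by_type = entry
            result[key] = by_type
        elem.clear()  # drop the children of the chapter we just processed
    return result


# Shortened copy of the XML from the question, used only for this demo.
sample = b"""<bigchapter>
  <chapter id="a" name="x">
    <valueimportant v="valuetoget1"/>
    <TimeSeries>
      <type v="a1"/>
      <Period>
        <Interval><Pos v="1"/><Qty v="26"/></Interval>
        <Interval><Pos v="2"/><Qty v="26"/></Interval>
      </Period>
    </TimeSeries>
    <TimeSeries>
      <type v="b1"/>
      <Period>
        <Interval><Pos v="1"/><Qty v="26"/></Interval>
      </Period>
    </TimeSeries>
  </chapter>
  <chapter id="a" name="x">
    <valueimportant v="valuetoget2"/>
    <TimeSeries>
      <type v="a1"/>
      <Period>
        <Interval><Pos v="1"/><Qty v="154"/></Interval>
        <Interval><Pos v="2"/><Qty v="126"/></Interval>
      </Period>
    </TimeSeries>
  </chapter>
</bigchapter>"""

result = parse_chapters(io.BytesIO(sample))
print(result)
```

parse_chapters also accepts a file name or an open file object, since that is what iterparse takes. Because the function reacts only to the "end" event of <chapter>, any XML that comes before that part of the file is simply skipped. lxml's etree.iterparse yields (event, element) pairs in the same way, so the same loop should carry over if lxml is preferred.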