How to parse a tree using lxml iterparse

This is part of my XML, starting from some point in the file:

<bigchapter>
     <chapter id="a" name="x">
      <valueimportant v="valuetoget1"/>
      <TimeSeries>
        <TimeSeriesIdentification v="1"/>
        <type v="a1"/>
        <Period>
          <Interval>
            <Pos v="1"/>
            <Qty v="26"/>
          </Interval>
          <Interval>
            <Pos v="2"/>
            <Qty v="26"/>
          </Interval>
        </Period>
      </TimeSeries>
      <TimeSeries>
        <type v="b1"/>
        <Period>
          <Interval>
            <Pos v="1"/>
            <Qty v="26"/>
          </Interval>
          <Interval>
            <Pos v="2"/>
            <Qty v="26"/>
          </Interval>
        </Period>
      </TimeSeries>
     </chapter>
     <chapter id="a" name="x">
      <valueimportant v="valuetoget2"/>
      <TimeSeries>
        <TimeSeriesIdentification v="1"/>
        <type v="a1"/>
        <Period>
          <Interval>
            <Pos v="1"/>
            <Qty v="154"/>
          </Interval>
          <Interval>
            <Pos v="2"/>
            <Qty v="126"/>
          </Interval>
        </Period>
      </TimeSeries>
      <TimeSeries>
        <type v="b1"/>
        <Period>
          <Interval>
            <Pos v="1"/>
            <Qty v="137"/>
          </Interval>
          <Interval>
            <Pos v="2"/>
            <Qty v="148"/>
          </Interval>
        </Period>
      </TimeSeries>
     </chapter>
</bigchapter>

What I want is a dictionary keyed by the valueimportant value, where each value is another dictionary keyed by type, whose values are in turn dictionaries mapping Pos to Qty.

The expected result is:

{valuetoget1: {a1: {1: 26, 2: 26}, b1: {1: 26, 2: 26}}, valuetoget2: {a1: {1: 154, 2: 126}, b1: {1: 137, 2: 148}}}

There is also some XML before this part, which is irrelevant. With the attempt below I get the first level of the dictionary (the keys), but I do not know how to proceed. I would prefer to use lxml etree.

from lxml import etree

result = {}
# file_obj is an already opened file (or a path) pointing at the XML
context = etree.iterparse(file_obj, events=("end",))
for event, elem in context:
    try:
        if elem.tag == 'chapter':
            valueimportant = elem.find('valueimportant')
            if valueimportant.attrib['v'] not in result:
                result[valueimportant.attrib['v']] = {}
    # A tuple is needed here: "except A or B or C" only catches A
    except (IndexError, KeyError, ValueError):
        print('error')

I'm not sure you need lxml here; the built-in ElementTree module should be enough. The main task is to collect data, so iterate over the root node and process each <chapter> separately: find the <valueimportant> node that has a v attribute, then iterate over the <Period> node and read the v attributes of the <Pos> and <Qty> nodes in each <Interval>.

Code:

import xml.etree.ElementTree as ET

xml = ET.parse("file.xml")
root = xml.getroot()

result = {}
# Direct children of the root; use root.iterfind(".//chapter") if chapters sit deeper
for chapter in root:
    # The outer key is the v attribute of <valueimportant>
    valueimportant = chapter.find("./valueimportant[@v]")
    if valueimportant is not None:
        # find() returns only the first match; chapter.find(".//Period") also works
        period = chapter.find("./TimeSeries/Period")
        if period is not None:
            values = {}
            for interval in period:  # or period.iterfind(".//Interval")
                pos = interval.find("./Pos[@v]")
                qty = interval.find("./Qty[@v]")
                if pos is not None and qty is not None:
                    # Attribute values are strings
                    values[pos.attrib["v"]] = qty.attrib["v"]
            result[valueimportant.attrib["v"]] = values
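The loop above reads only the first <Period> of each chapter and does not key the values by <type>, so it does not yet produce the nested dictionary asked for in the question. Below is a minimal sketch of one way to get there with iterparse. It uses the standard library's xml.etree.ElementTree.iterparse so that it runs without extra packages; lxml's etree.iterparse accepts the same source and events arguments. The function name collect_by_type and the shortened sample document are assumptions made for illustration, as is the conversion of Pos and Qty to int (attribute values arrive as strings).

```python
import io
import xml.etree.ElementTree as ET


def collect_by_type(source):
    """Build {valueimportant: {type: {Pos: Qty}}} from a file path or file object.

    Hypothetical helper for illustration; not part of any library.
    """
    result = {}
    # Only "end" events are requested, so every element is fully parsed
    # (children included) by the time it shows up in the loop.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "chapter":
            continue  # wait until a whole <chapter> has been read
        key_node = elem.find("valueimportant[@v]")
        if key_node is not None:
            by_type = result.setdefault(key_node.attrib["v"], {})
            for series in elem.iterfind("TimeSeries"):
                type_node = series.find("type[@v]")
                if type_node is None:
                    continue  # skip series without a type
                values = by_type.setdefault(type_node.attrib["v"], {})
                for interval in series.iterfind("Period/Interval"):
                    pos = interval.find("Pos[@v]")
                    qty = interval.find("Qty[@v]")
                    if pos is not None and qty is not None:
                        # Assumption: both attributes hold integers.
                        values[int(pos.attrib["v"])] = int(qty.attrib["v"])
        # Drop the finished chapter's children to keep memory use low.
        elem.clear()
    return result


# Shortened sample with the same shape as the XML in the question.
sample = b"""<bigchapter>
  <chapter id="a" name="x">
    <valueimportant v="valuetoget1"/>
    <TimeSeries>
      <type v="a1"/>
      <Period>
        <Interval><Pos v="1"/><Qty v="26"/></Interval>
        <Interval><Pos v="2"/><Qty v="26"/></Interval>
      </Period>
    </TimeSeries>
    <TimeSeries>
      <type v="b1"/>
      <Period>
        <Interval><Pos v="1"/><Qty v="26"/></Interval>
      </Period>
    </TimeSeries>
  </chapter>
</bigchapter>"""

print(collect_by_type(io.BytesIO(sample)))
```

Only <chapter> elements are cleared, and only after they have been processed. Clearing every element on its own "end" event would wipe the attributes of <Pos> and <Qty> before the enclosing chapter is read.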
