How to parse a tree using lxml iterparse

Question

This is part of my XML, starting partway through the file:

<bigchapter>
     <chapter id="a" name="x">
      <valueimportant v="valuetoget1"/>
      <TimeSeries>
        <TimeSeriesIdentification v="1"/>
        <type v="a1"/>
        <Period>
          <Interval>
            <Pos v="1"/>
            <Qty v="26"/>
          </Interval>
          <Interval>
            <Pos v="2"/>
            <Qty v="26"/>
          </Interval>
        </Period>
      </TimeSeries>
      <TimeSeries>
        <type v="b1"/>
        <Period>
          <Interval>
            <Pos v="1"/>
            <Qty v="26"/>
          </Interval>
          <Interval>
            <Pos v="2"/>
            <Qty v="26"/>
          </Interval>
        </Period>
      </TimeSeries>
     </chapter>
     <chapter id="a" name="x">
      <valueimportant v="valuetoget2"/>
      <TimeSeries>
        <TimeSeriesIdentification v="1"/>
        <type v="a1"/>
        <Period>
          <Interval>
            <Pos v="1"/>
            <Qty v="154"/>
          </Interval>
          <Interval>
            <Pos v="2"/>
            <Qty v="126"/>
          </Interval>
        </Period>
      </TimeSeries>
      <TimeSeries>
        <type v="b1"/>
        <Period>
          <Interval>
            <Pos v="1"/>
            <Qty v="137"/>
          </Interval>
          <Interval>
            <Pos v="2"/>
            <Qty v="148"/>
          </Interval>
        </Period>
      </TimeSeries>
     </chapter>
</bigchapter>

I want to build a dictionary keyed by the v value of valueimportant. Each value is another dictionary keyed by type, and each of those values is a dictionary that maps Pos to Qty.

The result should be:

{valuetoget1: {a1: {1: 26, 2: 26}, b1: {1: 26, 2: 26}}, valuetoget2: {a1: {1: 154, 2: 126}, b1: {1: 137, 2: 148}}}

There is also some XML before this part, which is irrelevant. With the attempt below I get the first level of the dictionary (the keys), but I do not know how to proceed. I would like to use lxml etree.

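from lxml import etree

result = {}
# file_obj is the already opened XML file (binary mode)
context = etree.iterparse(file_obj,
                          events=("end",))
for event, elem in context:
    try:
        if elem.tag == 'chapter':
            valueimportant = elem.find('valueimportant')
            if valueimportant.attrib['v'] not in result:
                result[valueimportant.attrib['v']] = {}

    # several exception types must be given as a tuple, not chained with "or";
    # AttributeError covers find() returning None
    except (IndexError, KeyError, ValueError, AttributeError):
        print('error')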

Answer 1

I'm not sure you need lxml here; the built-in ElementTree module should be enough. The main task is to collect data, so iterate over the root node and process each <chapter> separately: find its <valueimportant> node with a v attribute, then iterate over the <Period> node and find the <Pos> and <Qty> nodes with v attributes.

Code:

import xml.etree.ElementTree as ET

tree = ET.parse("file.xml")
root = tree.getroot()

result = {}
for chapter in root:  # or root.iterfind(".//chapter") if chapters are nested deeper
    valueimportant = chapter.find("./valueimportant[@v]")
    if valueimportant is not None:
        # find() returns only the first match, so only the first <TimeSeries> is read
        period = chapter.find("./TimeSeries/Period")  # or chapter.find(".//Period")
        if period is not None:
            values = {}
            for interval in period:  # or period.iterfind(".//Interval")
                pos = interval.find("./Pos[@v]")
                qty = interval.find("./Qty[@v]")
                if pos is not None and qty is not None:
                    values[pos.attrib["v"]] = qty.attrib["v"]  # both stay strings
            result[valueimportant.attrib["v"]] = values
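
The code above collects the Pos/Qty pairs from the first <TimeSeries> of each chapter. To get the nested shape the question asks for (keyed by type) while still streaming with iterparse, something along these lines might work. It is only a sketch under a few assumptions: the element names carry no XML namespace, Pos and Qty always hold integers, and each valueimportant value occurs once. The helper names chapter_to_dict and parse_chapters are made up for illustration, and the sample is an abbreviated copy of the XML from the question. The import falls back to the standard library when lxml is not installed, since both expose iterparse with an events argument.

```python
import io

try:
    from lxml import etree  # used when lxml is installed
except ImportError:
    # Assumption: the standard library parser is an acceptable fallback here.
    import xml.etree.ElementTree as etree


def chapter_to_dict(chapter):
    """Return {type: {pos: qty}} for one fully parsed <chapter> element."""
    by_type = {}
    for series in chapter.iterfind("TimeSeries"):
        type_node = series.find("type")
        if type_node is None:
            continue  # skip a series that has no <type>
        values = {}
        for interval in series.iterfind("Period/Interval"):
            pos = interval.find("Pos")
            qty = interval.find("Qty")
            if pos is not None and qty is not None:
                # Assumption: both v attributes are present and are integers.
                values[int(pos.get("v"))] = int(qty.get("v"))
        by_type[type_node.get("v")] = values
    return by_type


def parse_chapters(source):
    """Stream over source and build {valueimportant: {type: {pos: qty}}}."""
    result = {}
    for event, elem in etree.iterparse(source, events=("end",)):
        if elem.tag != "chapter":
            continue  # children are handled once their chapter has ended
        key_node = elem.find("valueimportant")
        if key_node is not None and key_node.get("v") is not None:
            result[key_node.get("v")] = chapter_to_dict(elem)
        elem.clear()  # drop the finished chapter's children to save memory
    return result


sample = b"""<bigchapter>
  <chapter id="a" name="x">
    <valueimportant v="valuetoget1"/>
    <TimeSeries>
      <type v="a1"/>
      <Period>
        <Interval><Pos v="1"/><Qty v="26"/></Interval>
        <Interval><Pos v="2"/><Qty v="26"/></Interval>
      </Period>
    </TimeSeries>
    <TimeSeries>
      <type v="b1"/>
      <Period>
        <Interval><Pos v="1"/><Qty v="26"/></Interval>
        <Interval><Pos v="2"/><Qty v="26"/></Interval>
      </Period>
    </TimeSeries>
  </chapter>
  <chapter id="a" name="x">
    <valueimportant v="valuetoget2"/>
    <TimeSeries>
      <type v="a1"/>
      <Period>
        <Interval><Pos v="1"/><Qty v="154"/></Interval>
        <Interval><Pos v="2"/><Qty v="126"/></Interval>
      </Period>
    </TimeSeries>
    <TimeSeries>
      <type v="b1"/>
      <Period>
        <Interval><Pos v="1"/><Qty v="137"/></Interval>
        <Interval><Pos v="2"/><Qty v="148"/></Interval>
      </Period>
    </TimeSeries>
  </chapter>
</bigchapter>"""

result = parse_chapters(io.BytesIO(sample))
print(result)
```

Because only "end" events are requested, every <chapter> is complete by the time it is inspected, so the ordinary find and iterfind calls work on it. With a real file, pass an object opened in binary mode (or a path) instead of the BytesIO wrapper.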
