
Extract specific data from an XML document in Python

A section of my XML document:

<?xml version="1.0"?>
<orderDocument xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://plmpack.com/stackbuilder/StackBuilderXMLExport.xsd">
  <author>cmj</author>
  <date>2019-02-14T10:45:48.4872033+01:00</date>
  <unit>mm|kg</unit>
  <orderType>
    <orderNumber>Analysis0</orderNumber>
    <loadSpace>
  <id>1</id>
  <name>Pallet0</name>
  <length>1200</length>
  <width>800</width>
  <maxLoadHeight>1500</maxLoadHeight>
  <maxLoadWeight>0</maxLoadWeight>
  <baseHeight>144</baseHeight>
  <maxLengthOverhang>0</maxLengthOverhang>
  <maxWidthOverhang>0</maxWidthOverhang>
</loadSpace>
<item>
  <id>1</id>
  <name>paper0</name>
  <length>320</length>
  <width>260</width>
  <height>120</height>
  <weight>5</weight>
  <maxWeightOnTop>0</maxWeightOnTop>
  <permittedOrientations>001</permittedOrientations>
</item>
<orderLine>
  <itemId>1</itemId>
  <quantity>110</quantity>
</orderLine>
<load>
  <loadSpaceId>1</loadSpaceId>
  <statistics>
    <loadVolume>1098240000</loadVolume>
    <volumeUtilization>84.365781710914447</volumeUtilization>
    <loadWeight>550</loadWeight>
    <weightUtilization>INF</weightUtilization>
    <loadHeight>1320</loadHeight>
    <cOfG>
      <x>0</x>
      <y>0</y>
      <z>0</z>
    </cOfG>
  </statistics>
  <placement>
    <itemId>1</itemId>
    <x>20</x>
    <y>10</y>
    <z>144</z>
    <L>XP</L>
    <W>YP</W>
  </placement>
  <placement>
    <itemId>1</itemId>
    <x>20</x>
    <y>270</y>
    <z>144</z>
    <L>XP</L>
    <W>YP</W>
  </placement>
  <placement>
    <itemId>1</itemId>
    <x>20</x>
    <y>530</y>
    <z>144</z>
    <L>XP</L>
    <W>YP</W>
  </placement>
  <placement>
    <itemId>1</itemId>
    <x>340</x>
    <y>10</y>
    <z>144</z>
    <L>XP</L>
    <W>YP</W>
   </placement>
  </load>     
 </orderType>      
</orderDocument>     

The code I have so far:

import os
import xml.etree.ElementTree as ET

from xml.etree.ElementTree import ElementTree

base_path = os.path.dirname(os.path.realpath(__file__))

xml_file = os.path.join(base_path, "first_try_palletizing.xml")

tree = ET.parse(xml_file)

root = tree.getroot()

The program is for a palletizing robot arm. The XML data comes from a program that calculates the best possible way to stack objects. What I need is to extract the "placement" data (x, y, z, L, W) so I can feed it to the robot program. I'm completely new to Python, so assume I don't know anything at all.

I've tried the code below, but I can't get any deeper than orderNumber, loadSpace, item, orderLine, and load.

for child in root:
    for element in child:
        print(element)

Sorry it's a bit messy, but this is my first time using Stack Overflow.
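For context, here is a minimal sketch of why the nested loops stop where they do. Two nested `for` loops only reach two levels below the root, while `Element.iter()` walks every level. The sketch also shows that the default namespace becomes part of every tag name, so searching for a bare `placement` finds nothing. `SAMPLE` is a shortened stand-in for the document above (an assumption, same namespace and nesting), and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

NS_URI = 'http://plmpack.com/stackbuilder/StackBuilderXMLExport.xsd'

# Shortened stand-in for the real file (assumption: same namespace and nesting).
SAMPLE = f"""<orderDocument xmlns="{NS_URI}">
  <orderType>
    <load>
      <placement>
        <itemId>1</itemId>
        <x>20</x>
        <y>10</y>
        <z>144</z>
        <L>XP</L>
        <W>YP</W>
      </placement>
    </load>
  </orderType>
</orderDocument>"""

root = ET.fromstring(SAMPLE)

# iter() visits the root and every descendant, at any depth.
all_tags = [el.tag for el in root.iter()]
for tag in all_tags:
    print(tag)  # each tag is printed as '{namespace-uri}name'

# A bare tag name does not match, because the namespace is part of the tag.
bare_matches = list(root.iter('placement'))

# The fully qualified name does match.
qualified_matches = list(root.iter('{' + NS_URI + '}placement'))

print(len(bare_matches), len(qualified_matches))
```

So there are two separate things to handle: reaching deeper levels of the tree, and dealing with the namespace in the tag names.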

The code below strips the namespaces from the tags and then looks for the 'placement' elements.

import xml.etree.ElementTree as ET
from io import StringIO  # Python 3 location of StringIO

xml = '''<?xml version="1.0"?>
<orderDocument xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://plmpack.com/stackbuilder/StackBuilderXMLExport.xsd">
  <author>cmj</author>
  <date>2019-02-14T10:45:48.4872033+01:00</date>
  <unit>mm|kg</unit>
  <orderType>
    <orderNumber>Analysis0</orderNumber>
    <loadSpace>
  <id>1</id>
  <name>Pallet0</name>
  <length>1200</length>
  <width>800</width>
  <maxLoadHeight>1500</maxLoadHeight>
  <maxLoadWeight>0</maxLoadWeight>
  <baseHeight>144</baseHeight>
  <maxLengthOverhang>0</maxLengthOverhang>
  <maxWidthOverhang>0</maxWidthOverhang>
</loadSpace>
<item>
  <id>1</id>
  <name>paper0</name>
  <length>320</length>
  <width>260</width>
  <height>120</height>
  <weight>5</weight>
  <maxWeightOnTop>0</maxWeightOnTop>
  <permittedOrientations>001</permittedOrientations>
</item>
<orderLine>
  <itemId>1</itemId>
  <quantity>110</quantity>
</orderLine>
<load>
  <loadSpaceId>1</loadSpaceId>
  <statistics>
    <loadVolume>1098240000</loadVolume>
    <volumeUtilization>84.365781710914447</volumeUtilization>
    <loadWeight>550</loadWeight>
    <weightUtilization>INF</weightUtilization>
    <loadHeight>1320</loadHeight>
    <cOfG>
      <x>0</x>
      <y>0</y>
      <z>0</z>
    </cOfG>
  </statistics>
  <placement>
    <itemId>1</itemId>
    <x>20</x>
    <y>10</y>
    <z>144</z>
    <L>XP</L>
    <W>YP</W>
  </placement>
  <placement>
    <itemId>1</itemId>
    <x>20</x>
    <y>270</y>
    <z>144</z>
    <L>XP</L>
    <W>YP</W>
  </placement>
  <placement>
    <itemId>1</itemId>
    <x>20</x>
    <y>530</y>
    <z>144</z>
    <L>XP</L>
    <W>YP</W>
  </placement>
  <placement>
    <itemId>1</itemId>
    <x>340</x>
    <y>10</y>
    <z>144</z>
    <L>XP</L>
    <W>YP</W>
   </placement>
  </load>     
 </orderType>      
</orderDocument> '''

placements_data = []

it = ET.iterparse(StringIO(xml))
for _, el in it:
    if '}' in el.tag:
        el.tag = el.tag.split('}', 1)[1]  # strip all namespaces
root = it.root
placements = root.findall('.//placement')
for idx, placement in enumerate(placements):
    print('placement # {}'.format(idx))
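    # Skip the first child (itemId) and print x, y, z, L, W.
    # Element.getchildren() was removed in Python 3.9, so slice the element instead.
    for child in placement[1:6]:
        print('\t{} - {}'.format(child.tag, child.text))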

Output:

placement # 0
    x - 20
    y - 10
    z - 144
    L - XP
    W - YP
placement # 1
    x - 20
    y - 270
    z - 144
    L - XP
    W - YP
placement # 2
    x - 20
    y - 530
    z - 144
    L - XP
    W - YP
placement # 3
    x - 340
    y - 10
    z - 144
    L - XP
    W - YP
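If you would rather keep the namespaces intact and collect the values instead of printing them, something like the following might work. It is only a sketch: the prefix `sb`, the helper `read_placements`, and the shortened `SAMPLE` document are names made up for this example, and converting x, y, z with `float` assumes the coordinates are always numeric (mm, per the `unit` element).

```python
import xml.etree.ElementTree as ET

# Map a prefix of our own choosing (hypothetical: 'sb') to the document's default namespace.
NS = {'sb': 'http://plmpack.com/stackbuilder/StackBuilderXMLExport.xsd'}

# Shortened stand-in for the real file (assumption: same namespace and nesting).
SAMPLE = """<orderDocument xmlns="http://plmpack.com/stackbuilder/StackBuilderXMLExport.xsd">
  <orderType>
    <load>
      <statistics>
        <cOfG><x>0</x><y>0</y><z>0</z></cOfG>
      </statistics>
      <placement>
        <itemId>1</itemId><x>20</x><y>10</y><z>144</z><L>XP</L><W>YP</W>
      </placement>
      <placement>
        <itemId>1</itemId><x>20</x><y>270</y><z>144</z><L>XP</L><W>YP</W>
      </placement>
    </load>
  </orderType>
</orderDocument>"""


def read_placements(root):
    """Return one dict per <placement> element with x, y, z, L, W."""
    placements = []
    # './/sb:placement' finds placement elements at any depth below root.
    for placement in root.findall('.//sb:placement', NS):
        placements.append({
            # 'sb:x' only matches direct children, so cOfG/x is not picked up.
            'x': float(placement.findtext('sb:x', namespaces=NS)),
            'y': float(placement.findtext('sb:y', namespaces=NS)),
            'z': float(placement.findtext('sb:z', namespaces=NS)),
            'L': placement.findtext('sb:L', namespaces=NS),
            'W': placement.findtext('sb:W', namespaces=NS),
        })
    return placements


placements_data = read_placements(ET.fromstring(SAMPLE))
for entry in placements_data:
    print(entry)
```

To read the real file instead of the sample string, pass `ET.parse(xml_file).getroot()` to `read_placements`. Each dict in `placements_data` can then be handed to the robot program in whatever form it expects.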
