简体   繁体   English

如何从python中获取(解析)XML子孙

[英]How to get ( parse ) subchild in XML from python

I am new to python or coding , so please be patient with my question, 我是python或编码的新手,所以请耐心等待我的问题,

So here's my busy XML 所以这是我忙碌的XML

    <?xml version="1.0" encoding="utf-8"?>
<Total>
    <ID>999</ID>
    <Response>
        <Detail>
        <Nix>
            <Check>pass</Check>
        </Nix>  
        <MaxSegment>
            <Status>V</Status>
            <Input>
                <Name>
                    <First>jack</First>
                    <Last>smiths</Last>
                </Name>
                <Address>
                <StreetAddress1>100 rodeo dr</StreetAddress1>
                <City>long beach</City>
                <State>ca</State>
                <ZipCode>90802</ZipCode>
                </Address>
                <DriverLicense>
                    <Number>123456789</Number>
                    <State>ca</State>
                </DriverLicense>
                <Contact>
                <Email>x@me.com</Email>
                <Phones>
                    <Home>0000000000</Home>
                    <Work>1111111111</Work>
                </Phones>
                </Contact>
            </Input>
            <Type>Regular</Type>
        </MaxSegment>
        </Detail>
    </Response>
</Total>

what I am trying to do is extract these value into nice and clean table below : 我想要做的是将这些值提取到下面漂亮干净的表中:
在此输入图像描述

Here's my code so far.. but I couldn't figure it out how to get the subchild : 到目前为止,这是我的代码..但我无法弄明白如何获得子代:

   import os
os.chdir('d:/py/xml/')

import xml.etree.ElementTree as ET
tree = ET.parse('xxml.xml')
root=tree.getroot()
x = root.tag
y = root.attrib
print(x,y)

#---PRINT ALL NODES---
for child in root:
    print(child.tag, child.attrib)

Thank you in advance ! 先感谢您 !

You could create a dictionary that maps the column names to xpath expressions that extract corresponding values eg: 您可以创建一个字典,将列名称映射到提取相应值的xpath表达式,例如:

xpath = {
  "ID": "/Total/ID/text()",
  "Check": "/Total/Response/Detail/Nix/Check/text()", # or "//Check/text()"
}

To populate the table row: 要填充表格行:

row = {name: tree.xpath(path) for name, path in xpath.items()}

The above assumes that you use lxml that support the full xpath syntax. 以上假设您使用支持完整xpath语法的lxml ElementTree supports only a subset of XPath expressions but it might be enough in your case (you could remove "text()" expression and use el.text in this case) eg: ElementTree只支持XPath表达式的一个子集,但在你的情况下它可能已经足够了(你可以删除“text()”表达式并在这种情况下使用el.text )例如:

xpath = {
  "ID": ".//ID",
  "Check": ".//Check",
}
row = {name: tree.findtext(path) for name, path in xpath.items()}

To print all text with corresponding tag names: 要打印具有相应标签名称的所有文本:

import xml.etree.cElementTree as etree

for _, el in etree.iterparse("xxm.xml"):
    if el.text and not el: # leaf element with text
       print el.tag, el.text

If column names differ from tag names (as in your case) then the last example is not enough to build the table. 如果列名与标记名不同(如您的情况),则最后一个示例不足以构建表。

This is how you could traverse the tree and print only the text nodes: 这是您遍历树并仅打印文本节点的方法:

def traverse(node):
    show = True
    for c in node.getchildren():
        show = False
        traverse(c)
    if show:
        print node.tag, node.text

for you example I get the following: 为你举例我得到以下内容:

traverse(root)

ID 999
Check pass
Status V
First jack
Last smiths
StreetAddress1 100 rodeo dr
City long beach
State ca
ZipCode 90802
Number 123456789
State ca
Email x@me.com
Home 0000000000
Work 1111111111
Type Regular

Instead of printing out you could store (node.tag, node.text) tuples or store {node.tag: node.text} in a dict. 您可以存储(node.tag, node.text)元组或将{node.tag: node.text}存储在dict中(node.tag, node.text)而不是打印出来。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM