
Parse an XML file with nested tags in Python

I want to parse an XML file that looks like this:

 <?xml version="1.0" encoding="UTF-8"?><Significant Major="3" Minor="0" Revision="1" xmlns="urn:reuterscompanycontent:significantdevelopments03"><RepNo>0091N</RepNo><CompanyName Type="Primary">XYZ</CompanyName><Production Date="2017-02-23T18:10:39" /><Developments><Development ID="3534388"><Dates><Source>2017-02-23T18:18:32</Source><Initiation>2017-02-23T18:18:32</Initiation><LastUpdate>2017-02-23T18:23:26</LastUpdate></Dates><Flags><FrontPage>0</FrontPage><Significance>1</Significance></Flags><Topics><Topic1 Code="254">Regulatory / Company Investigation</Topic1></Topics><Headline>FTC approves final order settling charges for Abbott's deal with St. Jude Medical</Headline></Development></Developments></Significant>

I only want to parse the Development tag and each of its nested tags. I have the code below:

import xml.etree.cElementTree as ET
tree = ET.ElementTree(file='../rawdata/SigDev_0091N.xml')

#get the root element
root = tree.getroot()

#print root.tag, root.attrib

for child in root:
#print child.tag, child.attrib
    name = child.tag
    print name
    print 'at line 13'
    if name is 'Developments':
        print 'at line 15'
        for devChild in name['Developments']:
            print devChild.tag,devChild.attrib

Execution never enters the if block, and I don't know why.

The check name is 'Developments' always returns False, because child.tag returns the value in {xmlns}tagname format. In addition, is tests object identity rather than equality, so strings should be compared with ==.

In your case:

name = '{urn:reuterscompanycontent:significantdevelopments03}Developments'
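As a minimal sketch of the comparison that does match, the loop can compare against the fully qualified tag. This assumes a shortened inline copy of the XML from the question rather than the original file, and it uses xml.etree.ElementTree with Python 3 style print calls instead of the cElementTree import in the question. The names XML, NS and found are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the file in the question (same namespace, fewer elements).
XML = (
    '<Significant xmlns="urn:reuterscompanycontent:significantdevelopments03">'
    '<RepNo>0091N</RepNo>'
    '<Developments><Development ID="3534388">'
    '<Headline>FTC approves final order</Headline>'
    '</Development></Developments>'
    '</Significant>'
)

NS = 'urn:reuterscompanycontent:significantdevelopments03'
root = ET.fromstring(XML)

found = []
for child in root:
    # child.tag has the form '{namespace}localname'; compare with ==, not `is`.
    if child.tag == '{%s}Developments' % NS:
        # Iterate over the element itself, not over the tag string.
        for dev_child in child:
            found.append((dev_child.tag, dev_child.attrib))
            print('%s %s' % (dev_child.tag, dev_child.attrib))
```

Note that the inner loop iterates over child. Indexing the tag string, as in name['Developments'], would raise a TypeError once the if block is reached.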

You may refer to the answer to this question.

Simple string methods such as strip(), find() and split(), or the re module, can help you skip the namespace when comparing.
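One way to do this with split() is sketched below. The helper name local_name is hypothetical and not part of ElementTree. It assumes the tag is either a plain name or in '{namespace}name' form.

```python
def local_name(tag):
    """Return the tag without its '{namespace}' prefix, if it has one."""
    # '{ns}Developments'.split('}', 1) -> ['{ns', 'Developments'];
    # a tag with no namespace yields a one-element list, so [-1] works for both.
    return tag.split('}', 1)[-1]


qualified = '{urn:reuterscompanycontent:significantdevelopments03}Developments'
print(local_name(qualified) == 'Developments')
```

With such a helper, the condition in the question's loop could be written as local_name(child.tag) == 'Developments'.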

Related Python documentation: https://docs.python.org/2/library/xml.etree.elementtree.html#parsing-xml-with-namespaces
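The approach described in that documentation section, passing a prefix-to-URI dictionary to findall(), might look like the sketch below. The prefix sd is an arbitrary choice made here, the XML is a shortened inline stand-in for the file in the question, and rows is an illustrative variable name.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the file in the question (same namespace).
XML = (
    '<Significant xmlns="urn:reuterscompanycontent:significantdevelopments03">'
    '<Developments><Development ID="3534388">'
    '<Dates><Source>2017-02-23T18:18:32</Source></Dates>'
    '<Flags><FrontPage>0</FrontPage><Significance>1</Significance></Flags>'
    '</Development></Developments>'
    '</Significant>'
)

# Map a prefix of our choosing to the document's default namespace.
ns = {'sd': 'urn:reuterscompanycontent:significantdevelopments03'}

root = ET.fromstring(XML)
rows = []
for dev in root.findall('sd:Developments/sd:Development', ns):
    # iter() yields the Development element and every descendant in document order.
    for elem in dev.iter():
        name = elem.tag.split('}', 1)[-1]  # drop the '{namespace}' prefix
        rows.append((name, elem.text))
        print('%s %s %s' % (name, elem.attrib, elem.text))
```

This avoids comparing tag strings by hand and visits every nested tag under each Development element.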
