
How to get the prefix part of an XML namespace in Python?

I have the following XML (in brief):

<?xml version="1.0" encoding="iso-8859-1"?>
<SOAP-ENV:Envelope 
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" 
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <SOAP-ENV:Body>
        <Proposal 
            xmlns="http://www.opengis.net/AAA"  
            xmlns:apd="http://www.opengis.net/BBB" 
            xmlns:common="http://www.opengis.net/DDD"  
            xmlns:core="http://www.opengis.net/EEE" 
            xmlns:pdt="http://www.opengis.net/CCC" 
            xmlns:xlink="http://www.opengis.net/FFF" 
            xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            
            <SchemaVersion>1.3</SchemaVersion>
            <ApplicationHeader>
                <ApplicationTo>W1234                         </ApplicationTo>
                <DateSubmitted>2021-04-26</DateSubmitted>
# ...
            <Agent>
                <common:PersonName>
                    <pdt:PersonNameTitle>Mr </pdt:PersonNameTitle>
                    <pdt:PersonGivenName>Holmes</pdt:PersonGivenName>
                    <pdt:PersonFamilyName>Sherlock</pdt:PersonFamilyName>
                </common:PersonName>
                <common:OrgName>Bad Company LTD</common:OrgName>
        </Proposal>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

I am trying to extract the XML tags together with the prefix part of the namespace (basically the way it looks in the XML). Using Python 3.10.3, I have tried many variations of the following.


from lxml import html, etree
...

def list_xml_tags(xml_blob):
    print(' [INFO] Printing all XML tags:')
    
    xml = etree.fromstring(bytes(xml_blob, encoding='utf-8'))  
    root = etree.Element("root")
    
    print('Root TAG: {}'.format(root))
    print('nsmap : {}'.format(root.nsmap))
    print('\nDescendants:')
    
    for el in xml.iter():
        el.tag = el.xpath('local-name()')
        #ns = el.xpath('namespace-uri()')
        #ns = etree.QName(el).namespace
        #ns = root.nsmap
        ns = etree.QName(el).namespace
        if el.attrib == None: el.attrib =''
        print('{} : {}  : {}'.format(ns, el.tag, el.attrib))

However, this is not working. I am not able to get the namespace at all this way; the only thing that comes out is None. (I am also not sure why the root tag is shown as an address.)

 [INFO] Printing all XML tags:
 ------------------------------------------------------------
Root TAG: <Element root at 0x16a85fc2340>
nsmap : {}

Descendants:
None : Envelope  : {}
None : Body  : {}
None : Proposal  : {}
None : SchemaVersion  : {}
...

Q: How can I get the following output?

SOAP-ENV : Envelope
pdt      : PersonGivenName
common   : OrgName
...

etc.

With Python 3 and lxml:

from lxml import etree                                  
doc = etree.parse('tmp.xml')
# namespace reverse lookup dict
ns = { value:(key if key is not None else 'default') for (key,value) in set(doc.xpath('//*/namespace::*'))}
for ele in doc.iter():
    qn = etree.QName(ele)
    print(f"{ns[qn.namespace]:>30} : {qn.localname}")

Result:
Elements listed with the prefix default belong to the default namespace, which is declared without a prefix: xmlns="http://www.opengis.net/AAA"

       SOAP-ENV : Envelope
       SOAP-ENV : Body
        default : Proposal
        default : SchemaVersion
        default : ApplicationHeader
        default : ApplicationTo
        default : DateSubmitted
        default : Agent
         common : PersonName
            pdt : PersonNameTitle
            pdt : PersonGivenName
            pdt : PersonFamilyName
         common : OrgName
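
As a possible alternative, lxml elements also expose a prefix attribute, so a reverse lookup dict may not be needed. The sketch below is only an illustration under a few assumptions: the XML is a trimmed, well-formed stand-in for the one in the question, and the names list_prefixed_tags and default_label are hypothetical helpers, not part of lxml. It also hints at why the attempt in the question prints None: that code assigns the local name to el.tag before calling etree.QName(el), which appears to strip the namespace from the element, and it prints a freshly created etree.Element("root") rather than the parsed document, so the nsmap is empty and the element's repr (with its address) is shown instead of its tag.

```python
from lxml import etree

# Trimmed, well-formed stand-in for the XML in the question (assumption:
# only a few elements are kept so the example is self-contained).
XML = b"""<?xml version="1.0" encoding="iso-8859-1"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
    <SOAP-ENV:Body>
        <Proposal xmlns="http://www.opengis.net/AAA"
                  xmlns:common="http://www.opengis.net/DDD"
                  xmlns:pdt="http://www.opengis.net/CCC">
            <SchemaVersion>1.3</SchemaVersion>
            <Agent>
                <common:PersonName>
                    <pdt:PersonGivenName>Holmes</pdt:PersonGivenName>
                </common:PersonName>
                <common:OrgName>Bad Company LTD</common:OrgName>
            </Agent>
        </Proposal>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def list_prefixed_tags(xml_bytes, default_label="default"):
    """Return (prefix, local name) pairs for every element, in document order.

    Hypothetical helper for illustration; default_label is the placeholder
    used for elements that carry no prefix.
    """
    root = etree.fromstring(xml_bytes)
    pairs = []
    # Passing etree.Element restricts iteration to element nodes
    # (comments and processing instructions are skipped).
    for el in root.iter(etree.Element):
        # el.prefix is None for elements in the default (unprefixed)
        # namespace or in no namespace at all.
        prefix = el.prefix if el.prefix is not None else default_label
        # Read the local name via QName instead of assigning to el.tag,
        # so the namespace information on the element stays intact.
        pairs.append((prefix, etree.QName(el).localname))
    return pairs


if __name__ == "__main__":
    for prefix, name in list_prefixed_tags(XML):
        print(f"{prefix:>10} : {name}")
```

One difference from the reverse lookup above: el.prefix reports the prefix actually written on each element, which may matter if a document binds more than one prefix to the same namespace URI.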
