
Python lxml: how to fetch XML tag names with xpath selector?

I am trying to parse the following XML with Python and lxml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/bind9.xsl"?>
<isc version="1.0">
  <bind>
    <statistics version="2.2">
      <memory>
        <summary>
          <TotalUse>1232952256
          </TotalUse>
          <InUse>835252452
          </InUse>
          <BlockSize>598212608
          </BlockSize>
          <ContextSize>52670016
          </ContextSize>
          <Lost>0
          </Lost>
        </summary>
      </memory>
    </statistics>
  </bind>
</isc>

The goal is to extract the tag name and text of each element under bind/statistics/memory/summary, to produce the following mapping:

TotalUse: 1232952256
InUse: 835252452
BlockSize: 598212608
ContextSize: 52670016
Lost: 0

I have managed to extract the element values, but I cannot figure out the xpath expression that returns the element tag names.

Sample script:

from lxml import etree as et

def main():

    xmlfile = "bind982.xml"
    location = "bind/statistics/memory/summary/*"
    label_selector = "??????" ## what to put here...?
    value_selector = "text()"

    with open(xmlfile, "r") as data:
        xmldata = et.parse(data)

        etree = xmldata.getroot()

        statlist = etree.xpath(location)

        for stat in statlist:
            label = stat.xpath(label_selector)[0]
            value = stat.xpath(value_selector)[0]
            print("{0}: {1}".format(label, value))

if __name__ == '__main__':
    main()

I know I could use label = stat.tag instead of stat.xpath(), but the script has to be generic enough to handle other pieces of XML where the label selector is different.

Which xpath selector returns an element's tag name?

Simply use XPath's name() function and drop the [0] index, because name() returns a string rather than a list.

from lxml import etree as et

def main():

    xmlfile = "ExtractXPathTagName.xml"
    location = "bind/statistics/memory/summary/*"
    label_selector = "name()"  # XPath name() yields the tag name of the context node
    value_selector = "text()"

    with open(xmlfile, "r") as data:
        xmldata = et.parse(data)

        etree = xmldata.getroot()

        statlist = etree.xpath(location)

        for stat in statlist:
            label = stat.xpath(label_selector)
            value = stat.xpath(value_selector)[0]
            print("{0}: {1}".format(label, value).strip())

if __name__ == '__main__':
    main()

Output

TotalUse: 1232952256
InUse: 835252452
BlockSize: 598212608
ContextSize: 52670016
Lost: 0
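To make the difference in return types concrete, here is a minimal self-contained sketch. It assumes the XML is supplied as an inline bytes literal (a shortened copy of the question's document) rather than read from a file. It only illustrates why the [0] index is needed for text() but not for name().

```python
from lxml import etree

# Shortened inline copy of the question's XML. It is passed as bytes because
# the document carries an encoding declaration.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<isc version="1.0">
  <bind>
    <statistics version="2.2">
      <memory>
        <summary>
          <TotalUse>1232952256
          </TotalUse>
          <Lost>0
          </Lost>
        </summary>
      </memory>
    </statistics>
  </bind>
</isc>"""

root = etree.fromstring(XML)
stat = root.xpath("bind/statistics/memory/summary/*")[0]

label = stat.xpath("name()")   # XPath string result: a str, no indexing needed
values = stat.xpath("text()")  # XPath node-set result: a list of text nodes

print(type(label).__mro__, label)
print(type(values), len(values))
print("{0}: {1}".format(label, values[0].strip()))
```

Because the element text ends with a newline and indentation, the value is stripped on its own here before formatting.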

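I don't think you need XPath for the two values. Element nodes have the properties tag and text, so you could use, for example, a list comprehension: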

[(element.tag, element.text) for element in etree.xpath(location)]

Or, if you really want to use XPath:

result = [(element.xpath('name()'), element.xpath('string()')) for element in etree.xpath(location)]
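For the question's document the two variants give the same labels, but they can differ once namespaces are involved. The sketch below uses a hypothetical namespaced fragment (the prefix s and the URI http://example.com/stats are invented for illustration and do not come from the question) to show what each approach returns.

```python
from lxml import etree

# Hypothetical namespaced fragment, invented for illustration only.
XML = b'<s:summary xmlns:s="http://example.com/stats"><s:Lost>0</s:Lost></s:summary>'

root = etree.fromstring(XML)
element = root[0]

tag = element.tag                          # Clark notation: {namespace-uri}local-name
qname = element.xpath("name()")            # prefixed name as written in the document
local = element.xpath("local-name()")      # name without prefix or namespace

print(tag)
print(qname)
print(local)
```

If the labels should stay stable regardless of the prefix a document happens to use, local-name() is probably the safer selector.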

You could of course also construct a list of dictionaries:

result = [{ element.tag : element.text } for element in etree.xpath(location)]

or

result = [{ element.xpath('name()') : element.xpath('string()') } for element in etree.xpath(location)]
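Since the question asks for a single mapping and wants the selectors to stay configurable, one possible variation is a dict comprehension with both selectors passed in as XPath strings. This is only a sketch: the helper name collect_stats is hypothetical, and it swaps in the standard XPath 1.0 function normalize-space() for the value so that the trailing newline and indentation in each element are trimmed.

```python
from lxml import etree

# Inline copy of the question's XML, as bytes because of the encoding declaration.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<isc version="1.0">
  <bind>
    <statistics version="2.2">
      <memory>
        <summary>
          <TotalUse>1232952256
          </TotalUse>
          <InUse>835252452
          </InUse>
          <BlockSize>598212608
          </BlockSize>
          <ContextSize>52670016
          </ContextSize>
          <Lost>0
          </Lost>
        </summary>
      </memory>
    </statistics>
  </bind>
</isc>"""


def collect_stats(root, location,
                  label_selector="name()",
                  value_selector="normalize-space()"):
    """Hypothetical helper: map label -> value for every node matched by location.

    Both selectors must be XPath expressions that evaluate to a string.
    """
    return {
        str(element.xpath(label_selector)): str(element.xpath(value_selector))
        for element in root.xpath(location)
    }


root = etree.fromstring(XML)
stats = collect_stats(root, "bind/statistics/memory/summary/*")

for label, value in stats.items():
    print("{0}: {1}".format(label, value))
```

Other XML fragments could then be handled by passing a different location or label_selector, as long as the selector evaluates to a string.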

