Remove specific elements from an XML string using lxml in Python 3.5

Question

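I have the following XML as input to a Python function. I want to find specific elements that have a null value (an empty firstChild.nodeValue), remove them from the XML entirely, and return the result as a string. I am constrained to using only the lxml module. Can I get some help with this?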

<country name="Liechtenstein">
    <rank></rank>
    <a></a>
    <b></b>
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E">345</neighbor>
</country>

I want the output to be:

<country name="Liechtenstein">
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E">345</neighbor>
</country>

I do have the flexibility of a constant list of tag names that I can iterate over to look up each element's text. The list is: a = ('rank','year','a','b','gdppc','neighbor')

Please help!

Answer 1

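You can use a union to find all the nodes with a single XPath expression. Then, assuming you want to remove the nodes that have no text, you can call tree.remove(node):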

x = """<country name="Liechtenstein">
    <rank></rank>
    <a></a>
    <b></b>
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E">345</neighbor>
</country>"""

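from lxml import etree

# Parse the XML string into an element tree rooted at <country>.
tree = etree.fromstring(x)

a = ('rank', 'year', 'a', 'b', 'gdppc', 'neighbor')

# Build one union XPath ("//rank|//year|...") so a single query finds every listed tag.
for node in tree.xpath("|".join(map("//{}".format, a))):
    # Empty elements have text None; tree.remove() works here because every
    # match is a direct child of the root.
    if not node.text:
        tree.remove(node)
print(etree.tostring(tree).decode("utf-8"))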

This gives you:

<country name="Liechtenstein">
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E">345</neighbor>
</country>
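
One caveat: tree.remove(node) only succeeds when the node is a direct child of the root. If the listed tags may be nested deeper, a variant along these lines could remove each match through its own parent and return the string the question asks for. This is a minimal sketch; the function name remove_empty_by_tag and the nested sample document are made up for illustration.

```python
from lxml import etree


def remove_empty_by_tag(xml_text, tags):
    """Drop elements named in `tags` that have no text and no children, at any depth."""
    root = etree.fromstring(xml_text)
    # Same union idea as above: "//rank|//a|//year"
    expr = "|".join("//{}".format(tag) for tag in tags)
    for node in root.xpath(expr):
        parent = node.getparent()
        is_blank = not (node.text or "").strip() and len(node) == 0
        # The root has no parent and cannot be removed, so skip it.
        if parent is not None and is_blank:
            parent.remove(node)
    return etree.tostring(root).decode("utf-8")


# Hypothetical input in which <a> and <year> are nested one level down.
sample = "<country><rank></rank><info><a></a><year>2008</year></info></country>"
result = remove_empty_by_tag(sample, ("rank", "a", "year"))
print(result)
```

Note that lxml removes an element together with its tail text, so any whitespace or text that follows a removed element is dropped as well.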

Answer 2

The code below works :)

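from lxml import etree

def remove_empty_elements(self, xml_input):
    # Method excerpt from a class; `self` is not used in the body.
    tree = etree.fromstring(xml_input)
    # Matches only elements whose text is exactly one space; truly empty
    # elements such as <rank></rank> have no text node and are not matched.
    for found in tree.xpath("//*[text()=' ']"):
        print("deleted " + str(found))
        found.getparent().remove(found)
    print(etree.tostring(tree).decode("utf-8"))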
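
The predicate text()=' ' depends on the empty elements containing exactly one space. If the input may also hold truly empty elements, as in the sample from the question, a broader predicate might be more robust. The sketch below is one possible variant, not the answerer's code; the name strip_blank_elements is an assumption, and it returns the string instead of printing it.

```python
from lxml import etree


def strip_blank_elements(xml_text):
    """Remove every leaf element whose text is empty or whitespace-only."""
    root = etree.fromstring(xml_text)
    # not(*): the element has no child elements.
    # not(normalize-space()): its text is empty once whitespace is trimmed.
    for found in root.xpath("//*[not(*) and not(normalize-space())]"):
        parent = found.getparent()
        # The root has no parent and cannot be removed.
        if parent is not None:
            parent.remove(found)
    return etree.tostring(root).decode("utf-8")


cleaned = strip_blank_elements(
    "<country><rank></rank><a> </a><year>2008</year></country>"
)
print(cleaned)
```

This version ignores tag names, so it would also remove an element that carries only attributes and no text. If that matters, the predicate can be combined with the tag list from the question.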


Answer 1
0 2016-09-26 22:36:49

Answer 2
-1 2016-09-27 03:42:07
