
Remove empty tags, and the parent if it becomes empty too, from XML using Python

I am new to Python and am trying to use lxml to remove empty tags from XML. I want to remove every empty element and, if that leaves its parent empty too, remove the parent as well.

Actual XML

<magento_api>
  <data_item>
    <code>400</code>
    <message>Attribute weight is not applicable for product type Configurable Product</message>
  </data_item>
  <data_item>
    <code></code>
    <message>Resource data pre-validation error.</message>
  </data_item>
  <data_item>
    <code>1</code>
    <message></message>
  </data_item>
  <data_item>
    <code></code>
    <message></message>
  </data_item>
</magento_api>

Modified XML

<magento_api>
  <data_item>
    <code>400</code>
    <message>Attribute weight is not applicable for product type Configurable Product</message>
  </data_item>
  <data_item>
    <message>Resource data pre-validation error.</message>
  </data_item>
  <data_item>
    <code>1</code>
  </data_item>
</magento_api>

I tried something like the code below, but it isn't working:

from lxml import etree

def recursively_empty(xml_element):
   if xml_element.text:
       return False
   return all((recursively_empty(xe) for xe in xml_element.iterchildren()))


data = """
<magento_api>
<data_item>
 <code>400</code>
<message>Attribute weight is not applicable for product type Configurable Product</message>
</data_item>
<data_item>
<code>400</code>
<message></message>
</data_item>
<data_item>
<code></code>
<message>abc</message>
</data_item>
<data_item>
<code></code>
<message></message>
</data_item>
</magento_api>
"""

xml_root = etree.fromstring(data)

for action, xml_element in xml_root:
    parent = xml_element.getparent()
    if recursively_empty(xml_element):
        parent.remove(xml_element)

print (etree.tostring(xml_root))

One option is to call the XPath function normalize-space() on each element to get its string value with whitespace trimmed and collapsed. If that value is empty, remove the element.
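To see what that function returns, here is a small sketch on a made-up fragment (the element names a to f are only for illustration). The string value of an element includes the text of its descendants, so a parent whose children carry text is not considered empty.

```python
from lxml import etree

# Hypothetical fragment: text with stray spaces, whitespace only,
# text nested in a child, and a self-closing element.
root = etree.fromstring("<a><b>  x  y </b><c>   </c><d><e>z</e></d><f/></a>")

# Map each child's tag to its whitespace-normalized string value.
values = {child.tag: child.xpath('normalize-space()') for child in root}
print(values)
```

An empty string is falsy in Python, which is why a plain `if not content` test is enough to detect an empty element.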

Example...

Python (Note: I used your "Actual XML" example, not the XML in your Python code.)

from lxml import etree

data = """
<magento_api>
  <data_item>
    <code>400</code>
    <message>Attribute weight is not applicable for product type Configurable Product</message>
  </data_item>
  <data_item>
    <code></code>
    <message>Resource data pre-validation error.</message>
  </data_item>
  <data_item>
    <code>1</code>
    <message></message>
  </data_item>
  <data_item>
    <code></code>
    <message></message>
  </data_item>
</magento_api>
"""

parser = etree.XMLParser(remove_blank_text=True)
xml_root = etree.fromstring(data, parser=parser)

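# Snapshot the elements first so removals cannot disturb the traversal.
for xml_element in list(xml_root.iter()):
    # normalize-space() returns the element's text content with whitespace
    # trimmed and collapsed; an empty result means there is nothing to keep.
    content = xml_element.xpath('normalize-space()')
    parent = xml_element.getparent()
    if not content and parent is not None:  # never try to remove the root
        parent.remove(xml_element)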

print(etree.tostring(xml_root, pretty_print=True).decode())

Output

<magento_api>
  <data_item>
    <code>400</code>
    <message>Attribute weight is not applicable for product type Configurable Product</message>
  </data_item>
  <data_item>
    <message>Resource data pre-validation error.</message>
  </data_item>
  <data_item>
    <code>1</code>
  </data_item>
</magento_api>
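
As for why the loop in the question did not do what was intended, one likely cause is the line `for action, xml_element in xml_root`. Iterating an lxml element yields its child elements, not (action, element) pairs as `etree.iterparse` does, so each `<data_item>` is unpacked into its two children and only `<message>` is ever tested. The sketch below keeps the question's `recursively_empty` idea, with two changes that are my own assumptions about the intent: whitespace-only text counts as empty, and children are pruned before their parents. The shortened input is made up for the demonstration.

```python
from lxml import etree

def recursively_empty(element):
    # An element counts as empty when it has no non-whitespace text
    # and all of its children are empty too.
    if element.text and element.text.strip():
        return False
    return all(recursively_empty(child) for child in element.iterchildren())

root = etree.fromstring(
    "<magento_api>"
    "<data_item><code>1</code><message></message></data_item>"
    "<data_item><code></code><message></message></data_item>"
    "</magento_api>"
)

# Iterating an element yields its children, so a two-name loop target
# unpacks each <data_item> into its <code> and <message> children.
first, second = root[0]
print(first.tag, second.tag)  # code message

# Snapshot the descendants, then walk them in reverse document order
# so that children are pruned before their parents.
for element in reversed(list(root.iterdescendants())):
    if recursively_empty(element):
        element.getparent().remove(element)

print(etree.tostring(root).decode())
```

This version only looks at `.text`, so text that follows a child element (its tail) is ignored when deciding whether the parent is empty. The normalize-space() approach above does take that text into account.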
