I'm new to Python. I'm trying to remove an XML tag from an XML document. I want to remove ALL of the <tag2> and </tag2> tags, but keep the "foo" and "bar" text inside them. Suggestions? I'm trying to avoid lxml.
<entry name="xml">
<tag>
<tag2>foo</tag2>
</tag>
<tag3>
<tag2>bar</tag2>
</tag3>
<tag4>
<tag2>foo</tag2>
</tag4>
<tag5>
<tag2>bar</tag2>
</tag5>
</entry>
EDIT: Here's what I need the output to be:
<entry name="xml">
<tag>
foo
</tag>
<tag3>
bar
</tag3>
<tag4>
foo
</tag4>
<tag5>
bar
</tag5>
</entry>
You could iterate over the element tree with xml.etree.ElementTree from the standard library. This builds a list of the text from every tag that has text in it.
import xml.etree.ElementTree as ET

tree = ET.parse('x.xml')
root = tree.getroot()
text = []
for child in tree.iter():
    # child.text is None for elements without text, so check it first
    if child.text and '\n' not in child.text:
        text.append(child.text)
Or a simpler statement from David Zemens (with a guard added, since child.text can be None):
text = [child.text for child in tree.iter() if child.text and child.text.strip()]
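Note that both snippets only collect the text; they leave the document itself unchanged. If you want the output shown in the question, one possible approach is to "unwrap" each tag2 element: move its text (and any children) up into the parent, then remove the element. Here is a rough sketch using only xml.etree.ElementTree. The names unwrap_tag and SAMPLE are made up for illustration, and it assumes the tag to remove is never the root element and carries no attributes you need to keep.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, copied from the question.
SAMPLE = """<entry name="xml">
<tag>
<tag2>foo</tag2>
</tag>
<tag3>
<tag2>bar</tag2>
</tag3>
<tag4>
<tag2>foo</tag2>
</tag4>
<tag5>
<tag2>bar</tag2>
</tag5>
</entry>"""


def unwrap_tag(root, tag):
    """Remove every <tag> element below root, keeping its text and children."""
    # Walk in reverse document order so nested matches are unwrapped
    # before the elements that contain them.
    for parent in reversed(list(root.iter())):
        for child in list(parent):
            if child.tag != tag:
                continue
            siblings = list(parent)
            index = siblings.index(child)
            inner = list(child)

            # Text that must end up where the opening tag used to be.
            lead = child.text or ""
            if inner:
                # Text after the closing tag follows the last moved child.
                inner[-1].tail = (inner[-1].tail or "") + (child.tail or "")
            else:
                lead += child.tail or ""

            # Attach the leading text to whatever precedes the removed element.
            if index == 0:
                parent.text = (parent.text or "") + lead
            else:
                prev = siblings[index - 1]
                prev.tail = (prev.tail or "") + lead

            parent.remove(child)
            for offset, node in enumerate(inner):
                parent.insert(index + offset, node)


root = ET.fromstring(SAMPLE)
unwrap_tag(root, "tag2")
print(ET.tostring(root, encoding="unicode"))
```

To work on a file instead, you could load it with ET.parse('x.xml'), call unwrap_tag(tree.getroot(), 'tag2'), and save the result with tree.write(...).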