Merging XML elements while keeping the contents using Python
I've been looking around for a way to remove an element from an XML document while keeping its contents, using Python, but I haven't been able to find an answer that works.
Basically, I received an XML document in the following format (example):
<root>
<element1>
<element2>
<text> random text </text>
</element2>
</element1>
<element1>
<element3>
<text> random text </text>
</element3>
</element1>
</root>
What I have to do is merge element2 and element3 into a single element1, so that the output XML document looks like this:
<root>
<element1>
<element2>
<text> random text </text>
</element2>
<element3>
<text> random text </text>
</element3>
</element1>
</root>
I would appreciate some tips on my (hopefully) simple problem.
Note: I am somewhat new to Python as well, so bear with me.
This might not be the prettiest of solutions, but since there's no other answer yet...
You could just search for, e.g., </element1><element1> and replace it with the empty string.
xml = """<root>
<element1>
<element2>
<text> random text </text>
</element2>
</element1>
<element1>
<element3>
<text> random text </text>
</element3>
</element1>
</root>"""
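import re

# Delete each "</element1> <element1>" boundary (plus surrounding whitespace)
# so that the two adjacent element1 blocks become one.
print(re.sub(r"\s*</element1>\s*<element1>", "", xml))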
Or, more generally, use
re.sub(r"\s*</([a-zA-Z0-9_]+)>\s*<\1>", "", xml)
to merge all consecutive instances of the same element: the pattern captures the first element name as a group and then looks for that same name again with the backreference \1. Note that it only matches opening tags without attributes.
Output, in both cases:
<root>
<element1>
<element2>
<text> random text </text>
</element2>
<element3>
<text> random text </text>
</element3>
</element1>
</root>
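As a rough sketch, the generalized pattern could be wrapped up like this for Python 3. The names ADJACENT_SAME_TAG and merge_adjacent are made up for illustration and are not part of the original answer. Keep in mind that the pattern merges any two adjacent same-named elements anywhere in the document, and it leaves tags with attributes alone.

```python
import re

# Hypothetical names, chosen for this sketch only.
# Matches "</name>", optional whitespace, then "<name>" for the same name.
ADJACENT_SAME_TAG = re.compile(r"\s*</([a-zA-Z0-9_]+)>\s*<\1>")


def merge_adjacent(xml_text):
    """Remove each '</name><name>' boundary so adjacent same-named elements merge."""
    return ADJACENT_SAME_TAG.sub("", xml_text)


xml = """<root>
<element1>
<element2>
<text> random text </text>
</element2>
</element1>
<element1>
<element3>
<text> random text </text>
</element3>
</element1>
</root>"""

merged = merge_adjacent(xml)
print(merged)
```

Because this works on the raw text, it knows nothing about comments, CDATA sections, or namespaces, so it is best kept for small, predictable inputs like the one above.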
For more complex documents, you might want to use one of Python's many XML libraries instead.
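One possible way to do that is with xml.etree.ElementTree from the standard library. This is only a sketch under some assumptions: the helper name merge_adjacent_siblings is invented here, it only looks at the direct children of the element you pass in, and it drops any text or attributes that sit directly on the element being folded away.

```python
import xml.etree.ElementTree as ET

xml_text = """<root>
<element1>
<element2>
<text> random text </text>
</element2>
</element1>
<element1>
<element3>
<text> random text </text>
</element3>
</element1>
</root>"""


def merge_adjacent_siblings(parent):
    """Fold each child into the previous sibling when both share a tag.

    Hypothetical helper for this sketch: attributes and direct text of the
    folded element are discarded, only its child elements are moved.
    """
    previous = None
    for child in list(parent):  # iterate over a copy, since we remove items
        if previous is not None and child.tag == previous.tag:
            previous.extend(list(child))  # move the grandchildren over
            parent.remove(child)          # drop the now-redundant wrapper
        else:
            previous = child


root = ET.fromstring(xml_text)
merge_adjacent_siblings(root)
print(ET.tostring(root, encoding="unicode"))
```

Unlike the regex, this operates on the parsed tree, so it is not confused by attributes or unusual whitespace inside tags. To apply it at every level you could call it on each element from root.iter().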