
How can I remove the extra space in empty XML tags?

I have an XML file in which I look for a specific tag (for example, <x>), and if I find it, I replace its value with a specific text (for example, TEST).

Python version: 3.5.0.

Sample XML file:

<root>
 <a/>
 <b>0</b>
 <c/>
 <x>some value</x>
</root>

This is my code:

from xml.etree import ElementTree as et

datafile = 'input.xml'         # path to the source XML file
datafile_out = 'output.xml'    # path to the updated XML file
tree = et.parse(datafile)
tree.find('.//x').text = 'TEST'    # find the <x> tag and set its value to "TEST"
tree.write(datafile_out)           # write the updated XML file

And this is my output:

<root>
 <a />
 <b>0</b>
 <c />
 <x>TEST</x>
</root>

The value is updated as expected.

My problem is the extra space in empty tags: the output contains <a /> with a space between the tag name "a" and the slash, which was not present in the input XML file.
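The behaviour can be reproduced in memory, without any files. This is a minimal sketch that uses only the standard library; the variable names and the shortened sample document are illustrative and not part of the original script:

```python
from xml.etree import ElementTree as et

# Shortened sample: empty tags are written without a space in the input.
source = '<root><a/><b>0</b></root>'

root = et.fromstring(source)

# encoding='unicode' makes tostring() return a str without an XML declaration.
serialized = et.tostring(root, encoding='unicode')

print(serialized)  # <root><a /><b>0</b></root>
```

The space is added by the serializer when it writes an empty element, so it appears no matter how the input was formatted.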

I'm working with fairly large XML files that contain a lot of empty tags, so the extra space in every empty tag makes the files noticeably bigger.

Is there any way to stop ElementTree.write() from adding that extra space?

Note: I would like to use only built-in Python modules and not install third-party solutions.

Many thanks for your advice!

Have you tried using regular expressions?

As an example (Java syntax):

yourXmlAsString = yourXmlAsString.replaceAll(">\\s*<", "><");

This removes all whitespace between XML elements.
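The pattern above targets whitespace between elements, whereas the space in question sits inside the tag, just before the slash. The same idea can be applied in Python with the built-in re module by post-processing the serialized string. The following is only a sketch under some assumptions: the helper name serialize_compact and its strip_between flag are hypothetical, and the document is assumed to contain no comments or processing instructions (their content is not escaped, so a literal " />" inside them would also be rewritten).

```python
import re
from xml.etree import ElementTree as et


def serialize_compact(root, strip_between=False):
    """Serialize an element, writing empty tags as <a/> instead of <a />.

    Hypothetical helper for illustration; not part of ElementTree.
    """
    xml_text = et.tostring(root, encoding='unicode')
    # ElementTree escapes '>' in text and attribute values as '&gt;',
    # so a literal ' />' in its output marks the end of an empty tag.
    xml_text = re.sub(r' />', '/>', xml_text)
    if strip_between:
        # Optional: also drop whitespace between adjacent tags.
        # This discards indentation and can alter mixed content.
        xml_text = re.sub(r'>\s+<', '><', xml_text)
    return xml_text


root = et.fromstring('<root>\n <a/>\n <b>0</b>\n <x>some value</x>\n</root>')
root.find('.//x').text = 'TEST'
compact = serialize_compact(root)
smallest = serialize_compact(root, strip_between=True)
```

To produce the output file, the returned string can be written with open('output.xml', 'w', encoding='utf-8') in place of the call to tree.write(). For very large files this holds the whole serialized document in memory, which may or may not be acceptable.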
