
How to remove the root tag and keep all the row tags in an XML file using Python

I have the XML file below.

<root>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
</catalog>
<catalog>
   <book id="bk102">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>45.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
</catalog>
<catalog>
   <book id="bk103">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>46.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
</catalog>
</root>

I want to create another XML by removing the <root> tag, so that my new XML looks like this:

<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
</catalog>
<catalog>
   <book id="bk102">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>45.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
</catalog>
<catalog>
   <book id="bk103">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>46.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
</catalog>

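Below is my code. By removing the <root> tag and keeping all the necessary row tags, I can produce a bytes object. But in the end I cannot convert that bytes object back to XML, and I get the following error:

xml.etree.ElementTree.ParseError: junk after document element: line 11, column 0

Can you please help?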

import xml.etree.ElementTree as ET

base_tree = ET.parse('input.xml')
catalog = list(base_tree.getroot())
elemList = []
for elem in catalog:
  getele = ET.tostring(elem, 'utf-8')
  elemList.append(getele)

byt = b''.join(elemList)
print(byt)

mytree = ET.ElementTree(ET.fromstring(byt))
dis = str(ET.tostring(mytree.getroot()), 'utf-8')

You can use a list for this:

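import xml.etree.ElementTree as ET

with open('input.xml') as input_file:
    text = input_file.read()
# Serialize every child of <root>, not only the first one
catalogs = list(ET.fromstring(text))
result = ''.join(ET.tostring(c, encoding='unicode') for c in catalogs)
print(result)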

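Note, though, that the resulting string will not be valid XML.

A well-formed XML document must have exactly one root element, which is why parsing the joined fragments fails with "junk after document element".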
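To illustrate the point about the root element, here is a minimal sketch. The helper names `parses_as_document` and `parse_fragment` and the temporary `<wrapper>` tag are made up for this example. It shows that a string with several top-level elements is rejected by the parser, and that wrapping it in a temporary root is one way to read it back.

```python
import xml.etree.ElementTree as ET

# Two sibling <catalog> elements with no enclosing root (not a valid document).
fragment = b"<catalog><book id='bk101'/></catalog><catalog><book id='bk102'/></catalog>"

def parses_as_document(data):
    """Return True if data parses as a single well-formed XML document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

def parse_fragment(data):
    """Parse sibling elements by wrapping them in a temporary root.

    The <wrapper> tag is an arbitrary, hypothetical name chosen for this sketch.
    """
    wrapper = ET.fromstring(b"<wrapper>" + data + b"</wrapper>")
    return list(wrapper)

print(parses_as_document(fragment))
print([elem.tag for elem in parse_fragment(fragment)])
```

So the rootless output can be written to a file as text, but any XML parser that reads it later will need a wrapper like this, or will have to handle each <catalog> separately.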

For pure text processing, maybe we can do:

import re

# Match both the opening <root> and the closing </root> tag
pattern = re.compile(r"</?root>")
removed = pattern.sub('', "<root>something</root>")

print(removed)  # something
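The same text-only idea can be applied to a whole document. The sketch below is only an illustration: `strip_root_tags` is a hypothetical helper, and the pattern is extended (an assumption beyond the original answer) so that an opening `<root ...>` tag with attributes is matched as well. Being a plain regex, it would also remove any nested element named `root` and does not understand comments or CDATA sections.

```python
import re

# Matches <root>, </root>, and an opening <root ...> that carries attributes.
# It does not match other tag names that merely start with "root".
ROOT_TAG = re.compile(r"</?root(?:\s[^>]*)?>")

def strip_root_tags(xml_text):
    """Remove the root tags from xml_text and trim the surrounding whitespace."""
    return ROOT_TAG.sub("", xml_text).strip()

sample = (
    '<root version="1">\n'
    '<catalog><book id="bk101"/></catalog>\n'
    '<catalog><book id="bk102"/></catalog>\n'
    '</root>'
)
print(strip_root_tags(sample))
```

For a file on disk, the text read from `input.xml` could be passed to the same function and the result written to a new file.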

