How to parse XML with multiple root elements
I need to parse both the var and group root elements.
Code
import xml.etree.ElementTree as ET
tree_ownCloud = ET.parse('0020-syslog_rules.xml')
root = tree_ownCloud.getroot()
Error
xml.etree.ElementTree.ParseError: junk after document element: line 17, column 0
Sample XML
<var name="BAD_WORDS">core_dumped|failure|error|attack| bad |illegal |denied|refused|unauthorized|fatal|failed|Segmentation Fault|Corrupted</var>
<group name="syslog,errors,">
  <rule id="1001" level="2">
    <match>^Couldn't open /etc/securetty</match>
    <description>File missing. Root access unrestricted.</description>
    <group>pci_dss_10.2.4,gpg13_4.1,</group>
  </rule>
  <rule id="1002" level="2">
    <match>$BAD_WORDS</match>
    <options>alert_by_email</options>
    <description>Unknown problem somewhere in the system.</description>
    <group>gpg13_4.3,</group>
  </rule>
</group>
I tried following a couple of other questions on Stack Overflow, but none helped.
I know why it is not getting parsed, and people have usually resorted to hacks. IMO having multiple root elements in XML is a very common use case, and there must be something in the ET parsing library to handle it.
As mentioned in the comment, an XML file cannot have multiple roots. Simple as that.
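The single-root rule is easy to confirm with ElementTree itself. The sketch below uses two made-up inline strings (not the real rules file) and records the error message the parser produces for the two-root case.

```python
import xml.etree.ElementTree as ET

# Illustrative inputs only; these are not the contents of the real rules file.
one_root = "<group><rule id='1001'/></group>"
two_roots = "<var name='BAD_WORDS'>error</var>\n<group name='syslog'/>"

# Exactly one top-level element: this parses without complaint.
ET.fromstring(one_root)

# Two top-level elements: the parser stops at the second one.
try:
    ET.fromstring(two_roots)
    error_message = None
except ET.ParseError as exc:
    error_message = str(exc)

print(error_message)
```

The message is the same "junk after document element" error shown in the question, raised at the point where the second top-level element begins.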
If you do receive or store data in this format (in which case it is not proper XML), you could consider the hack of surrounding what you have with a fake tag, e.g.
import xml.etree.ElementTree as ET

# Read the whole file, then wrap it in a synthetic root element before parsing
with open("0020-syslog_rules.xml", "r") as inputFile:
    fileContent = inputFile.read()
root = ET.fromstring("<fake>" + fileContent + "</fake>")
print(root)
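Once the content is wrapped, the original top-level elements become children of the fake root and can be queried as usual. Here is a minimal sketch of that, assuming a shortened inline copy of the sample rules; the helper name parse_multi_root and the SAMPLE constant are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for 0020-syslog_rules.xml (assumption: same overall shape).
SAMPLE = """<var name="BAD_WORDS">core_dumped|failure|error</var>
<group name="syslog,errors,">
  <rule id="1001" level="2">
    <match>^Couldn't open /etc/securetty</match>
    <description>File missing. Root access unrestricted.</description>
  </rule>
  <rule id="1002" level="2">
    <match>$BAD_WORDS</match>
    <description>Unknown problem somewhere in the system.</description>
  </rule>
</group>"""


def parse_multi_root(text, wrapper="fake"):
    """Wrap the text in a synthetic root element and parse the result."""
    return ET.fromstring(f"<{wrapper}>{text}</{wrapper}>")


root = parse_multi_root(SAMPLE)

# The former "roots" are now direct children of the wrapper element.
top_level_tags = [child.tag for child in root]

# findall() with a bare tag name matches direct children only.
variables = {v.get("name"): v.text for v in root.findall("var")}
rule_ids = [r.get("id") for g in root.findall("group") for r in g.findall("rule")]

print(top_level_tags, variables, rule_ids)
```

One caveat: if the file begins with an XML declaration (<?xml ... ?>), plain wrapping will fail, because the declaration would no longer sit at the very start of the input. It would have to be stripped first.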
Actually, the example data is not a well-formed XML document, but it is a well-formed XML entity. Some XML parsers have an option to accept an entity rather than a document, and in XPath 3.1 you can parse this using the parse-xml-fragment() function.
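As far as I am aware, ElementTree does not expose a fragment-parsing option of that kind. A rough approximation is to feed a synthetic start tag, the fragment, and a synthetic end tag into ET.XMLParser incrementally, which avoids building one large concatenated string. In the sketch below the function name parse_fragment_stream, the wrapper name, and the chunk size are assumptions chosen for illustration.

```python
import io
import xml.etree.ElementTree as ET


def parse_fragment_stream(text_stream, wrapper="fake", chunk_size=64 * 1024):
    """Parse a multi-root fragment from a text-mode stream via a synthetic root."""
    parser = ET.XMLParser()
    parser.feed(f"<{wrapper}>")        # open the synthetic root
    while True:
        chunk = text_stream.read(chunk_size)
        if not chunk:                  # empty string means end of stream
            break
        parser.feed(chunk)
    parser.feed(f"</{wrapper}>")       # close the synthetic root
    return parser.close()              # returns the wrapper element


# An in-memory stream stands in for open("0020-syslog_rules.xml", "r").
fragment = io.StringIO(
    '<var name="BAD_WORDS">error|failed</var>\n'
    '<group name="syslog,errors,"><rule id="1001" level="2"/></group>'
)
root = parse_fragment_stream(fragment, chunk_size=16)
tags = [child.tag for child in root]
print(tags)
```

This is still the wrapper trick rather than true entity parsing, so the same restriction applies: the input must not start with an XML declaration.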
Another way to parse a fragment like this is to create a wrapper document which references it as an external entity:
<!DOCTYPE wrapper [
<!ENTITY e SYSTEM "fragment.xml">
]>
<wrapper>&e;</wrapper>
and then supply this wrapper document as the input to your XML parser.