
Compare XML files using Python

I want to compare these two XML files:

File1.xml:

<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
  <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
      <type st="9999" />
  </gastro_prelim_st>
 </results>
</ngs_sample>

File2.xml:

<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
  <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
   </gastro_prelim_st>
 </results>
</ngs_sample>

I've used xmldiff to compare a.xml with b.xml:

from xmldiff import main, formatting

def compare_xmls(observed, expected):
    # DiffFormatter renders the edit script as plain text
    formatter = formatting.DiffFormatter()
    diff = main.diff_files(observed, expected, formatter=formatter)
    return diff

# File names must be passed as strings
out = compare_xmls("a.xml", "b.xml")
print(out)

Output:

[delete, /ngs_sample/results/gastro_prelim_st/type[2]]

Does anyone know how to identify the actual difference between the two XML files, i.e. what has been deleted compared to the file b.xml? Can anyone recommend another way of comparing XML files in Python?

You can switch to the XMLFormatter and manually filter out the results:

...
# Change formatter:
formatter = formatting.XMLFormatter(normalize=formatting.WS_BOTH)

...

# after `out` has been retrieved:
import re
for i in out.splitlines():
  if re.search(r'\bdiff:\w+', i):
    print(i)

# Result:
#       <type st="9999" diff:delete=""/>

Use xmldiff to perform this exact task.

main.py

from xmldiff import main
diff = main.diff_files("file1.xml", "file2.xml")
print(diff)

Output

[DeleteNode(node='/ngs_sample/results/gastro_prelim_st/type[2]')]
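
The XPath reported in the output above can be resolved against the first file to see which element was actually deleted. The following is a minimal sketch using only the standard library; `resolve_xpath` is a hypothetical helper (not part of xmldiff), and it assumes the path is an absolute, slash-separated path with optional positional predicates such as `type[2]`, as in the output shown.

```python
import xml.etree.ElementTree as ET

# Contents of File1.xml from the question, inlined so the sketch is self-contained
FILE1 = """<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
  <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
      <type st="9999" />
  </gastro_prelim_st>
 </results>
</ngs_sample>"""


def resolve_xpath(root, xpath):
    """Return the element an absolute XPath points to (hypothetical helper)."""
    # ElementTree only understands paths relative to an element,
    # so the leading root segment is checked and then dropped.
    parts = xpath.strip("/").split("/")
    root_name = parts[0].split("[")[0]  # tolerate a predicate like "root[1]"
    if root_name != root.tag:
        raise ValueError(f"path does not start at <{root.tag}>: {xpath}")
    if len(parts) == 1:
        return root
    return root.find("./" + "/".join(parts[1:]))


root = ET.fromstring(FILE1)
deleted = resolve_xpath(root, "/ngs_sample/results/gastro_prelim_st/type[2]")
print(deleted.tag, deleted.attrib)  # type {'st': '9999'}
```

When working with real files, `ET.parse("file1.xml").getroot()` could replace `ET.fromstring(FILE1)`.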

Another option is to use xml2 (https://github.com/clone/xml2) together with something like bash process substitution:

$ diff --color <(xml2 < File1.xml) <(xml2 < File2.xml)

7,8d6
< /ngs_sample/results/gastro_prelim_st/type
< /ngs_sample/results/gastro_prelim_st/type/@st=9999
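
The same flatten-then-diff idea can be approximated in pure Python. This is only a sketch: `flatten` and `flat_diff` are hypothetical helper names, and the line format merely imitates xml2's output (one line per element, attribute, and non-empty text) rather than reproducing it exactly. Tail text after child elements is ignored.

```python
import difflib
import xml.etree.ElementTree as ET

FILE1 = """<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
  <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
      <type st="9999" />
  </gastro_prelim_st>
 </results>
</ngs_sample>"""

FILE2 = """<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
  <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
   </gastro_prelim_st>
 </results>
</ngs_sample>"""


def flatten(elem, prefix=""):
    """Yield one path-like line per element, attribute, and non-empty text."""
    path = f"{prefix}/{elem.tag}"
    yield path
    for name, value in elem.attrib.items():
        yield f"{path}/@{name}={value}"
    text = (elem.text or "").strip()
    if text:
        yield f"{path}={text}"
    for child in elem:
        yield from flatten(child, path)


def flat_diff(xml_a, xml_b):
    """Return only the added ('+ ') and removed ('- ') flattened lines."""
    a = list(flatten(ET.fromstring(xml_a)))
    b = list(flatten(ET.fromstring(xml_b)))
    return [line for line in difflib.ndiff(a, b)
            if line.startswith(("- ", "+ "))]


for line in flat_diff(FILE1, FILE2):
    print(line)
# - /ngs_sample/results/gastro_prelim_st/type
# - /ngs_sample/results/gastro_prelim_st/type/@st=9999
```

Because the comparison runs on parsed elements, whitespace-only differences such as the indentation of the closing tags in the two files do not show up in the result.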

Try XmlXdiff. Currently only SVG output is available, but it should be quite simple to provide a text output or an interface class.

Some example code from the XmlXdiff website:

from XmlXdiff.XReport import DrawXmlDiff

_xml1 = """<root><deleted>with content</deleted><unchanged/><changed name="test1" /></root>"""
_xml2 = """<root><unchanged/><changed name="test2" /><added/></root>"""

with open("test1.xml", "w") as f:
    f.write(_xml1)

with open("test2.xml", "w") as f:
    f.write(_xml2)

x = DrawXmlDiff("test1.xml", "test2.xml")
x.saveSvg('xdiff.svg')

Example output: [SVG image, not reproduced here]
