Comparing XmlDocument for equality (content-wise)

If I want to compare the contents of an XmlDocument, is it just like this?

XmlDocument doc1 = GetDoc1();
XmlDocument doc2 = GetDoc2();

if(doc1 == doc2)
{

}

I am not checking whether they are both the same object reference, but whether the CONTENTS of the XML are the same.

Try the DeepEquals method in the XLinq (LINQ to XML) API.

XDocument doc1 = GetDoc1(); 
XDocument doc2 = GetDoc2(); 
 
if(XNode.DeepEquals(doc1, doc2)) 
{ 
 
} 

See also "Equality Semantics of LINQ to XML Trees".
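Since the question starts from XmlDocument rather than XDocument, here is a minimal sketch of how the two could be bridged before calling XNode.DeepEquals. The helper names ToXDocument and ContentEquals are made up for illustration; the conversion goes through XmlNodeReader, which is one common way to do it, not the only one.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class DeepEqualsDemo
{
    // Hypothetical helper: copies an XmlDocument into an XDocument
    // by reading the DOM through an XmlNodeReader.
    public static XDocument ToXDocument(XmlDocument doc)
    {
        using (var reader = new XmlNodeReader(doc))
        {
            reader.MoveToContent(); // skip ahead to the root element
            return XDocument.Load(reader);
        }
    }

    // Hypothetical helper: content comparison via XNode.DeepEquals.
    public static bool ContentEquals(XmlDocument a, XmlDocument b)
    {
        return XNode.DeepEquals(ToXDocument(a), ToXDocument(b));
    }

    public static void Main()
    {
        var a = new XmlDocument();
        var b = new XmlDocument();
        a.LoadXml("<root  foo='bar'  />");
        b.LoadXml("<root foo='bar'/>");

        // The two strings differ only in whitespace inside the tag.
        Console.WriteLine(ContentEquals(a, b)); // True
    }
}
```

Whether DeepEquals matches your own definition of "same contents" (for example with respect to attribute order or namespace prefixes) is worth checking against the equality semantics article before relying on it.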

No. XmlDocument does not override the behavior of the Equals() method, so it is in fact just performing reference equality, which will fail in your example unless the documents are actually the same object instance.
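A small sketch of what that means in practice: two documents loaded from identical text are still different objects, so both == and Equals() report false. The class name is only for illustration.

```csharp
using System;
using System.Xml;

public static class ReferenceEqualityDemo
{
    public static void Main()
    {
        var a = new XmlDocument();
        var b = new XmlDocument();
        a.LoadXml("<root foo='bar'/>");
        b.LoadXml("<root foo='bar'/>");

        // Same content, different instances: reference equality fails.
        Console.WriteLine(a == b);      // False
        Console.WriteLine(a.Equals(b)); // False

        // Only a second reference to the same instance compares equal.
        XmlDocument c = a;
        Console.WriteLine(a == c);      // True
    }
}
```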

If you want to compare the contents (attributes, elements, comments, PIs, etc.) of a document, you will have to implement that logic yourself. Be warned: it's not trivial.

Depending on your exact scenario, you may be able to remove all non-essential whitespace from the document (which itself can be tricky) and then compare the resulting XML text. This is not perfect: it fails for documents that are semantically identical but differ in things like how namespaces are used and declared, whether certain values are escaped or not, the order of elements, and so on. As I said before, XML comparison is not trivial.

You also need to clearly define what it means for two XML documents to be "identical". Does element or attribute ordering matter? Does case (in text nodes) matter? Should you ignore superfluous CDATA sections? Do processing instructions count? What about fully qualified vs. partially qualified namespaces?

In any general-purpose implementation, you're likely going to want to transform both documents into some canonical form (be it XML or some other representation) and then compare the canonicalized content.
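One possible canonical form is W3C Canonical XML (C14N), which .NET exposes through XmlDsigC14NTransform. The sketch below is an illustration of the idea rather than a recommendation: it assumes a reference to System.Security (or the System.Security.Cryptography.Xml package on newer runtimes), and the Canonicalize helper name is made up.

```csharp
using System;
using System.IO;
using System.Security.Cryptography.Xml;
using System.Xml;

public static class CanonicalXmlDemo
{
    // Hypothetical helper: returns the C14N text of a document.
    public static string Canonicalize(XmlDocument doc)
    {
        var transform = new XmlDsigC14NTransform(); // comments excluded by default
        transform.LoadInput(doc);
        using (var stream = (Stream)transform.GetOutput(typeof(Stream)))
        using (var reader = new StreamReader(stream)) // C14N output is UTF-8
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        var a = new XmlDocument();
        var b = new XmlDocument();
        // Different attribute order, quoting, and empty-element syntax.
        a.LoadXml("<root b='2' a='1'/>");
        b.LoadXml("<root a=\"1\" b=\"2\"></root>");

        Console.WriteLine(a.OuterXml == b.OuterXml);                 // False
        Console.WriteLine(Canonicalize(a) == Canonicalize(b));       // True
    }
}
```

C14N normalizes things like attribute order and empty-element syntax, but it does not reorder elements or trim text, so it only answers some of the "what does identical mean" questions above.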

Tools already exist that perform XML differencing, such as Microsoft XML Diff/Patch; you may be able to leverage one to identify differences between two documents. To my knowledge that tool is not distributed in source form, so to use it in an embedded application you would need to script the process (if you plan to use it, you should first verify that the licensing terms allow its use and redistribution).

EDIT: Check out @Max Toro's answer if you're using .NET 3.5 SP1, as apparently there's an option in XLinq that may be helpful. Nice to know it exists.

A simple way could be to compare OuterXml.

var a = new XmlDocument();
var b = new XmlDocument();

a.LoadXml("<root  foo='bar'  />");
b.LoadXml("<root foo='bar'/>");

Debug.Assert(a.OuterXml == b.OuterXml);

LBushkin is right, this is not trivial. Since XML is string data, you could technically hash the contents of each document and compare the hashes, but the result will be affected by things like whitespace.

You could perform a structured diff (also called an "XML diffgram") between the two documents and compare the results. This is how .NET datasets keep track of changes, for example.

Other than that, you'd have to iterate through the DOM and compare elements, attributes and values to each other. If there's a schema involved, then you would also have to take into account positions and so on.
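As a rough sketch of such a DOM walk, the code below compares two nodes recursively. It makes its own choices about what "equal" means (attribute order is ignored, child order is not, namespace declarations count as attributes), and the XmlDomComparer class and its methods are hypothetical names, not part of any library.

```csharp
using System;
using System.Xml;

public static class XmlDomComparer
{
    // Hypothetical helper: same node type, name, namespace and value,
    // same attributes in any order, same children in document order.
    public static bool ContentEquals(XmlNode a, XmlNode b)
    {
        if (a.NodeType != b.NodeType
            || a.LocalName != b.LocalName
            || a.NamespaceURI != b.NamespaceURI
            || a.Value != b.Value)
            return false;

        if (!AttributesEqual(a.Attributes, b.Attributes)) return false;

        if (a.ChildNodes.Count != b.ChildNodes.Count) return false;
        for (int i = 0; i < a.ChildNodes.Count; i++)
        {
            if (!ContentEquals(a.ChildNodes[i], b.ChildNodes[i])) return false;
        }
        return true;
    }

    private static bool AttributesEqual(XmlAttributeCollection x, XmlAttributeCollection y)
    {
        // Attributes is null for nodes that are not elements.
        if (x == null || y == null) return x == null && y == null;
        if (x.Count != y.Count) return false;

        foreach (XmlAttribute attr in x)
        {
            XmlAttribute other = y[attr.LocalName, attr.NamespaceURI];
            if (other == null || other.Value != attr.Value) return false;
        }
        return true;
    }

    public static void Main()
    {
        var a = new XmlDocument();
        var b = new XmlDocument();
        a.LoadXml("<root b='2' a='1'><x>1</x></root>");
        b.LoadXml("<root a='1' b='2'><x>1</x></root>");

        Console.WriteLine(ContentEquals(a, b)); // True
    }
}
```

Changing any of those choices (ignoring comments, treating child order as irrelevant, and so on) means changing the walk accordingly, which is where most of the effort tends to go.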

Often you want to compare XML strings whose elements are ordered differently. This can be done easily with this code:

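// [Test] and Assert.AreEqual are assumed to come from NUnit.
using System;
using System.Linq;
using System.Xml.Linq;
using NUnit.Framework;

class Testing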
{
    [Test]
    public void Test()
    {
        Assert.AreEqual(
            "<root><a></a><b></b></root>".SortXml()
            , "<root><b></b><a></a></root>".SortXml());
    }
}

public static class XmlCompareExtension
{
    public static string SortXml(this string @this)
    {
        var xdoc = XDocument.Parse(@this);

        SortXml(xdoc);

        return xdoc.ToString();
    }

    private static void SortXml(XContainer parent)
    {
        var elements = parent.Elements()
            .OrderBy(e => e.Name.LocalName)
            .ToArray();

        Array.ForEach(elements, e => e.Remove());

        foreach (var element in elements)
        {
            parent.Add(element);
            SortXml(element);
        }
    }
}
