简体   繁体   中英

What is the best way to compare XML files for equality?

I'm using .NET 2.0, and a recent code change has invalidated my previous Assert.AreEqual call (which compared two strings of XML). Only one element of the XML is actually different in the new codebase, so my hope is that a comparison of all the other elements will give me the result I want. The comparison needs to be done programmatically, since it's part of a unit test.

At first, I was considering using a couple instances of XmlDocument. But then I found this: http://drowningintechnicaldebt.com/blogs/scottroycraft/archive/2007/05/06/comparing-xml-files.aspx

It looks like it might work, but I was interested in Stack Overflow feedback in case there's a better way.

I'd like to avoid adding another dependency for this if at all possible.

Similar questions

It really depends on what you want to check as "differences".

Right now, we're using Microsoft XmlDiff: http://msdn.microsoft.com/en-us/library/aa302294.aspx

You might find it's less fragile to parse the XML into an XmlDocument and base your Assert calls on XPath Query. Here are some helper assertion methods that I use frequently. Each one takes a XPathNavigator, which you can obtain by calling CreateNavigator() on the XmlDocument or on any node retrieved from the document. An example of usage would be:

     XmlDocument doc = new XmlDocument( "Testdoc.xml" );
     XPathNavigator nav = doc.CreateNavigator();
     AssertNodeValue( nav, "/root/foo", "foo_val" );
     AssertNodeCount( nav, "/root/bar", 6 )

    private static void AssertNodeValue(XPathNavigator nav,
                                         string xpath, string expected_val)
    {
        XPathNavigator node = nav.SelectSingleNode(xpath, nav);
        Assert.IsNotNull(node, "Node '{0}' not found", xpath);
        Assert.AreEqual( expected_val, node.Value );
    }

    private static void AssertNodeExists(XPathNavigator nav,
                                         string xpath)
    {
        XPathNavigator node = nav.SelectSingleNode(xpath, nav);
        Assert.IsNotNull(node, "Node '{0}' not found", xpath);
    }

    private static void AssertNodeDoesNotExist(XPathNavigator nav,
                                         string xpath)
    {
        XPathNavigator node = nav.SelectSingleNode(xpath, nav);
        Assert.IsNull(node, "Node '{0}' found when it should not exist", xpath);
    }

    private static void AssertNodeCount(XPathNavigator nav, string xpath, int count)
    {
        XPathNodeIterator nodes = nav.Select( xpath, nav );
        Assert.That( nodes.Count, Is.EqualTo( count ) );
    }

Doing a simple string compare on a xml string not always work. Why ?

for example both :

<MyElement></MyElmennt> and <MyElment/> are equal from an xml standpoint ..

There are algorithms for converting making an xml always look the same, they are called canonicalization algorithms. .Net has support for canonicalization.

I wrote a small library with asserts for serialization, source .

Sample:

[Test]
public void Foo()
{
   ...
   XmlAssert.Equal(expected, actual, XmlAssertOptions.IgnoreDeclaration | XmlAssertOptions.IgnoreNamespaces);
}

NuGet

Because of the contents of an XML file can have different formatting and still be considered the same (from a DOM point of view) when you are testing the equality you need to determine what the measure of that equality is, for example is formatting ignored? does meta-data get ignored etc is positioning important, lots of edge cases.

Generally you would create a class that defines your equality rules and use it for your comparisons, and if your comparison class implements the IEqualityComparer and/or IEqualityComparer<T> interfaces, then your class can be used in a bunch of inbuilt framework lists as the equality test implementation as well. Plus of course you can have as many as you need to measure equality differently as your requirements require.

ie

IEnumerable<T>.Contains
IEnumerable<T>.Equals
The constructior of a Dictionary etc etc

I ended up getting the result I wanted with the following code:

private static void ValidateResult(string validationXml, XPathNodeIterator iterator, params string[] excludedElements)
    {
        while (iterator.MoveNext())
        {
            if (!((IList<string>)excludedElements).Contains(iterator.Current.Name))
            {
                Assert.IsTrue(validationXml.Contains(iterator.Current.Value), "{0} is not the right value for {1}.", iterator.Current.Value, iterator.Current.Name);
            }
        }
    }

Before calling the method, I create a navigator on the instance of XmlDocument this way:

XPathNavigator nav = xdoc.CreateNavigator();

Next, I create an instance of XPathExpression, like so:

XPathExpression expression = XPathExpression.Compile("/blah/*");

I call the method after creating an iterator with the expression:

XPathNodeIterator iterator = nav.Select(expression);

I'm still figuring out how to optimize it further, but it does the trick for now.

I made a method to create simple XML paths.

static XElement MakeFromXPath(string xpath)
{
    XElement root = null;
    XElement parent = null;
    var splits = xpath.Split('/'); //split xpath into parts
    foreach (var split in splits)
    {
        var el = new XElement(split);
        if (parent != null)
            parent.Add(el);
        else
            root = el; //first element created, set as root
        parent = el;
    }
    return root;
}

Sample usage:

var element = MakeFromXPath("My/Path/To/Element")'

element will contain the value:

<My>
  <Path>
    <To>
      <Element></Element>
    </To>
  </Path>
</My>

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM