How to check XML nodes contained in different XML files for equality?

I have two XML files (file A and file B, where file B is a subset of file A), which I read using the System.Xml.XmlDocument.Load(fileName) method.

I then select nodes within these files using System.Xml.XmlNode.SelectNodes(nodeName). I need to check that each selected XML node in file B is either equal to, or a subset of, the same node in file A. I also need to check that the subnodes within any node in file B appear in the same order as those same subnodes within that node in file A.
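For reference, a minimal sketch of the load-and-select step. It assumes the XML is already available as a string: `LoadXml` parses XML text, while `Load` is the method that accepts a file path. The class and method names (`XmlSelectionSketch`, `ChildNames`) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XmlSelectionSketch
{
    // Parses XML text and returns, in document order, the names of the
    // child elements of every node matched by the XPath expression.
    public static List<string> ChildNames(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml); // use doc.Load(path) when starting from a file name

        var names = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            foreach (XmlNode child in node.ChildNodes)
            {
                if (child is XmlElement)
                    names.Add(child.Name);
            }
        }
        return names;
    }
}
```

Calling `XmlSelectionSketch.ChildNames(xmlText, "//elementA")` lists the subelement names in the order they appear, which is the order the comparison has to respect.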

For example,

fileA

<rootNodeA>
 <elementA>
  <subElementA>content</subElementA>
  <subElementB>content</subElementB>
  <subElementC>content</subElementC>
  <subElementD>content</subElementD>
 </elementA>
 <elementB>
  <subElementA>content</subElementA>
  <subElementB>content</subElementB>
 </elementB>
</rootNodeA>

fileB

<rootNodeB>
 <elementA>
  <subElementB>content</subElementB>
  <subElementD>content</subElementD>
 </elementA>
 <elementB>
  <subElementA>content</subElementA>
 </elementB>
</rootNodeB>

As you can see, fileB is a subset of fileA. I need to check that the elementA node of file B is equal to, or a subset of, the same elementA node in file A. This should hold for the subnodes (subElementA, etc.) as well, and for the content of the nodes and subnodes.

Also, if you look at elementA in fileA, it has 4 subelements in the order A, B, C, D. The same elementA in fileB has 2 subelements in the order B, D. This order (B comes before D) is the same as in file A, and I need to check this as well.

My idea is to compute hashes of the nodes and then compare them, but I am unsure how to do that or whether it would serve the purpose.
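To make the hashing idea concrete, here is a rough sketch that hashes each node's `OuterXml` with SHA-256. The names `NodeHashSketch`, `NodeHash` and `Parse` are hypothetical. Equal hashes only indicate identical serialized markup, so this would cover the "equal" case but cannot by itself tell whether one node is a subset of another.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Xml;

public static class NodeHashSketch
{
    // Hashes the serialized markup of a node; identical markup gives identical hashes.
    public static string NodeHash(XmlNode node)
    {
        using (var sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(node.OuterXml));
            return Convert.ToBase64String(digest);
        }
    }

    // Helper for trying the sketch on inline XML text.
    public static XmlElement Parse(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement;
    }
}
```

Two nodes with the same markup hash identically, while a node with fewer subelements simply produces a different hash, with no indication of how the two relate.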

EDIT: Code I have so far,

    // Note: XmlElement does not override Equals/GetHashCode, so these sets compare
    // by reference; elements from two different documents never compare as equal.
    HashSet<XmlElement> hashA = new HashSet<XmlElement>();
    HashSet<XmlElement> hashB = new HashSet<XmlElement>();

    foreach (XmlElement node in nodeList)
    {
        hashA.Add(node);
    }
    foreach (XmlElement node in masterNodeList)
    {
        hashB.Add(node);
    }
    isSubset = hashA.IsSubsetOf(hashB);
    return isSubset;

This sounds like a simple recursive function.

I haven't checked whether it actually works, but something like this should do it:

using System;
using System.Linq;
using System.Xml;

// Returns true when target is equal to, or an ordered subset of, source.
public static bool isSubset(XmlElement source, XmlElement target)
{
    // list all child tags (by order); text nodes are ignored here
    var sourceChildren = source.ChildNodes.OfType<XmlElement>().ToArray();
    var targetChildren = target.ChildNodes.OfType<XmlElement>().ToArray();

    if (targetChildren.Length == 0)
    {
        if (sourceChildren.Length > 0) // surely not the same.
            return false;
        // XmlElement.Value is always null, so compare the text content instead.
        return string.Equals(source.InnerText, target.InnerText);
    }

    var currentSearchIndex = 0; // where we are searching from (just past the last match)

    foreach (var targetChild in targetChildren)
    {
        // first source child at or after the search index with the same name that contains targetChild
        var findIndex = Array.FindIndex(sourceChildren, currentSearchIndex,
            el => el.Name == targetChild.Name && isSubset(el, targetChild));
        if (findIndex == -1)
            return false; // not found in source, therefore not a subset.

        currentSearchIndex = findIndex + 1; // move past the match so earlier nodes can't be matched again.
    }

    return true; // every target child was found in source, in order.
}
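As a way to try the recursive idea on the question's data, here is a compact, self-contained sketch. It assumes leaf elements are compared by `InnerText` and that a matched source child can be used only once; the names `SubsetSketch`, `IsOrderedSubset` and `Root` are made up for this example.

```csharp
using System;
using System.Linq;
using System.Xml;

public static class SubsetSketch
{
    // True when every child element of target appears in source in the same
    // relative order, each matched child being a subset itself.
    // Leaf elements (no child elements) are compared by their text.
    public static bool IsOrderedSubset(XmlElement source, XmlElement target)
    {
        var targetChildren = target.ChildNodes.OfType<XmlElement>().ToArray();
        var sourceChildren = source.ChildNodes.OfType<XmlElement>().ToArray();

        if (targetChildren.Length == 0)
            return sourceChildren.Length == 0 && source.InnerText == target.InnerText;

        int searchFrom = 0;
        foreach (var targetChild in targetChildren)
        {
            int found = Array.FindIndex(sourceChildren, searchFrom,
                el => el.Name == targetChild.Name && IsOrderedSubset(el, targetChild));
            if (found == -1)
                return false;
            searchFrom = found + 1; // later matches must come after this one
        }
        return true;
    }

    // Helper for trying the sketch on inline XML text.
    public static XmlElement Root(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement;
    }
}
```

Passing the root of fileA as `source` and the root of fileB as `target` checks the whole tree in one call. The root element names themselves are not compared, so `rootNodeA` and `rootNodeB` do not have to match.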
