Detecting structural differences in XML using Linq and XElement

I'm trying to audit some XML that is used in a bespoke piece of software. I'm able to detect changes in identical structures using 'XNode.DeepEquals', and I then add an extra attribute to the elements that have changed so I can highlight them.

My problem is that this approach fails when the structure itself changes. (I'm enumerating over both XElements at the same time and performing a DeepEquals; if they are not equal, I recursively call the same method to narrow down where the exact changes occur.)

Obviously this falls apart when I'm enumerating and the nodes being compared are not the same. See the sample below:

Before

<?xml version="1.0" encoding="utf-16"?>
<Prices xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Price default="true">
    <Expression operator="Addition">
        <LeftOperand>
            <AttributeValue field="ccx_bandwidth" />
        </LeftOperand>
        <RightOperand>
            <Constant value="10" type="Integer" />
        </RightOperand>
    </Expression>
</Price>
<Price default="false">
    <Expression operator="Addition">
        <LeftOperand>
            <AttributeValue field="ccx_bandwidth" />
        </LeftOperand>
        <RightOperand>
            <Constant value="99" type="Integer" />
        </RightOperand>
    </Expression>
</Price>
<RollupChildren />
</Prices>

After

<?xml version="1.0" encoding="utf-16"?>
<Prices xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Price default="true">
    <Expression operator="Addition">
        <LeftOperand>
            <AttributeValue field="ccx_bandwidth" />
        </LeftOperand>
        <RightOperand>
            <Constant value="10" type="Integer" />
        </RightOperand>
    </Expression>
</Price>
<RollupChildren />
</Prices>

So you can see that the second Price node has been removed, and I need to show this change.

At the moment I have access to both pieces of XML and modify them when the audit application loads, adding an 'auditchanged' attribute that I bind the background to with a converter in my Silverlight app.

I've been playing around with Linq to Xml and looking at joining the two XElements in a query, but wasn't sure how to proceed.

Ideally, I would like to merge the two XElements together and add a separate attribute depending on whether an element was added or removed, which I can then bind to with a converter to highlight it in red or green as appropriate.

Does anyone have any bright ideas on this one? (I'd been looking at XmlDiff, but I don't think I can use that in Silverlight.)

The important part here is the descendants query. It turns every element in the first document into a list of its ancestors (including the element itself), where every item contains the name of the element and its index among its siblings of the same name. I think this could somehow be used for joining, though I have no idea how to do a full outer join with Linq. So instead I just use these lists to find the corresponding elements in the second document and then, depending on the result, mark each element as either deleted or changed.

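using System.Linq;
using System.Xml.Linq;

// in_A, in_B, out_A and out_B are the input and output file paths.
var doc = XDocument.Load(in_A);
var doc2 = XDocument.Load(in_B);

// For every element in the first document, build its path from the root.
// Each step holds the element, its name, and its index among same-named siblings.
var descendants = doc.Descendants().Select(d => 
    d.AncestorsAndSelf().Reverse().Select(el => 
        new {idx = el.ElementsBeforeSelf(el.Name).Count(), el, name = el.Name}).ToList());

foreach (var list in descendants) {
    XContainer el2 = doc2;
    var el = list.Last().el;
    // Walk the same path in the second document; el2 ends up null if it is missing.
    foreach (var item in list) {
        if (el2 == null) break;
        el2 = el2.Elements(item.name).Skip(item.idx).FirstOrDefault();
    }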
    string changed = "";
    if (el2 == null) changed += " deleted";
    else {
        var el2e = el2 as XElement;
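        // Compare attributes in both directions so that removed attributes are detected too.
        var attrs = el.Attributes().Select(a => new { a.Name, a.Value });
        var attrs2 = el2e.Attributes().Select(a => new { a.Name, a.Value });
        if (attrs2.Except(attrs).Any() || attrs.Except(attrs2).Any()) {
                changed += " attributes";
        }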
        if (!el2e.HasElements && el2e.Value != el.Value) {
            changed += " value";
        }
        el2e.SetAttributeValue("found", "found");
    }
    if (changed != "") el.SetAttributeValue("changed", changed.Trim());
}
doc.Save(out_A);
doc2.Save(out_B);
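
For the full outer join mentioned above, one possible approach is to turn each element's ancestor list into a single string key and join both documents on the union of those keys. This is only a rough sketch, not part of the code above: the `PathJoin` class, its `PathOf` and `Mark` methods, and the `audit` attribute name are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class PathJoin
{
    // Hypothetical helper: builds a key such as "Prices[0]/Price[1]" from each
    // ancestor's name and its index among preceding siblings of the same name.
    public static string PathOf(XElement e)
    {
        return string.Join("/", e.AncestorsAndSelf().Reverse()
            .Select(a => a.Name + "[" + a.ElementsBeforeSelf(a.Name).Count() + "]")
            .ToArray());
    }

    // Full outer join on the path key. Elements found only in the old document
    // are marked "removed"; elements found only in the new one are marked "added".
    public static void Mark(XDocument before, XDocument after)
    {
        var oldByPath = before.Descendants().ToDictionary(e => PathOf(e));
        var newByPath = after.Descendants().ToDictionary(e => PathOf(e));

        foreach (var key in oldByPath.Keys.Union(newByPath.Keys).ToList())
        {
            XElement o, n;
            oldByPath.TryGetValue(key, out o);
            newByPath.TryGetValue(key, out n);

            if (n == null) o.SetAttributeValue("audit", "removed");
            else if (o == null) n.SetAttributeValue("audit", "added");
        }
    }
}
```

Calling `PathJoin.Mark(doc, doc2)` on the two sample documents would tag the second Price element, and everything beneath it, in the first document. Because the key is positional, removing an element from the middle of a run of same-named siblings shifts the ones after it, so they would be compared against the wrong partner.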

I have a generic differ class in the codeblocks library: http://codeblocks.codeplex.com

Loading your XML documents and treating each document as an IEnumerable (a flattened XML tree) should allow you to use the differ as shown here: http://codeblocks.codeplex.com/wikipage?title=Differ%20Sample&referringTitle=Home

Here's the source code for differ.cs: http://codeblocks.codeplex.com/SourceControl/changeset/view/96119#1887406

The Diff prototype is:

static IEnumerable<DiffEntry> Diff(IEnumerable<T> oldData, IEnumerable<T> newData, Comparison<T> identity, Comparison<T> different)
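
As a rough sketch of what the two `Comparison<T>` arguments might look like for XML, the following defines one comparison that identifies elements by their position in the tree and one that compares their attributes. Everything here is an assumption: the `XmlDiffInputs` class and its members are invented for illustration, and I have not checked how the library interprets the return values (this assumes the usual convention that 0 means "equal") or whether it expects the inputs in any particular order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class XmlDiffInputs
{
    // Path of an element: each ancestor's name plus its index among
    // preceding siblings of the same name, e.g. "Prices[0]/Price[1]".
    static string PathOf(XElement e)
    {
        return string.Join("/", e.AncestorsAndSelf().Reverse()
            .Select(a => a.Name + "[" + a.ElementsBeforeSelf(a.Name).Count() + "]")
            .ToArray());
    }

    // Attributes rendered as text, sorted by name so that order does not matter.
    static string AttributesOf(XElement e)
    {
        return string.Join(";", e.Attributes()
            .OrderBy(a => a.Name.ToString(), StringComparer.Ordinal)
            .Select(a => a.Name + "=" + a.Value)
            .ToArray());
    }

    // Assumed "identity" comparison: 0 when both elements sit at the same path.
    public static readonly Comparison<XElement> Identity =
        (a, b) => string.CompareOrdinal(PathOf(a), PathOf(b));

    // Assumed "different" comparison: 0 when both elements carry the same attributes.
    public static readonly Comparison<XElement> Different =
        (a, b) => string.CompareOrdinal(AttributesOf(a), AttributesOf(b));
}
```

With these in place, the flattened inputs would simply be `oldDoc.Descendants()` and `newDoc.Descendants()`, passed to Diff together with `XmlDiffInputs.Identity` and `XmlDiffInputs.Different`.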
