C# LINQ XML parsing using “PreviousNode”
With quite a bit of help from SO, I managed to put together the following LINQ expression:
var parentids = xliff.Descendants()
    .Elements(xmlns + "trans-unit")
    .Elements(xmlns + "seg-source")
    .Elements(xmlns + "mrk")
    .Where(e => e.Attribute("mtype").Value == "seg")
    .Select(item => (XElement)item.Parent.Parent.PreviousNode)
    .Where(item => item != null)
    .Select(item => item.Elements(xmlns + "source")
        .Where(itema => itema != null)
        .Select(itemb => itemb.Elements(xmlns + "x")
            .LastOrDefault()
            .Attribute("id")
            .Value.ToString())).ToArray();
What it does: it locates a mrk tag (one that has @mtype="seg"), goes up to the trans-unit ancestor (.Parent.Parent), and checks whether the previous sibling trans-unit has a child trans. If not, it returns the @id of the last x element in the source child; otherwise it returns null (it must return null; it cannot simply return no match).
I should add that while the samples below contain only one such previous node with no trans element, the real-life XML contains many more, so I must use PreviousNode.
Here is the XML it works with, where it correctly returns "2":
<?xml version="1.0" encoding="utf-8"?>
<xliff xmlns:sdl="http://sdl.com/FileTypes/SdlXliff/1.0" version="1.2" sdl:version="1.0" xmlns="urn:oasis:names:tc:xliff:document:1.2">
<file original="Pasadena_Internet_2016.xml" source-language="en-US" datatype="x-sdlfilterframework2" target-language="da-DK">
<body>
<trans-unit id="d679cb2d-ecba-47ba-acb7-1bb4a798c755" translate="no">
<source>
<x id="0" />
<x id="1" />
<x id="2" />
</source>
</trans-unit>
<trans-unit id="aed9fde2-fd1b-4eba-bfc9-06d325aa7047">
<source>
<x id="3" />Pasadena, California’s iconic Colorado Boulevard <x id="4" />has been the site of the world-famous Tournament of Roses Parade since it began in 1890.
</source>
<seg-source>
<mrk mtype="seg" mid="1">
<x id="3" />Pasadena, California’s iconic Colorado Boulevard <x id="4" />has been the site of the world-famous Tournament of Roses Parade since it began in 1890.
</mrk>
</seg-source>
<target>
<mrk mtype="seg" mid="1">
<x id="3" /><x id="4" />Pasadena, Californiens ikoniske Colorado Boulevard har været stedet for den verdensberømte Rose Bowl-parade siden den begyndte i 1890.
</mrk>
</target>
</trans-unit>
</body>
</file>
</xliff>
The last problem I need to solve is that there is another type of XML in which the starting trans-unit is wrapped in a group element that is not present in the first XML. So here there is one more parent to jump up through in order to reach the previous trans-unit sibling, which sits right before the group.
I am trying to build this into the same LINQ expression so that it handles both scenarios.
In fact, if I change line 6 to this, it works:
.Select(item => (XElement)item.Parent.Parent.Parent.PreviousNode)
<!-- ^------ additional Parent -->
Here is the other XML, which currently throws an exception with the code above but should return "0":
<?xml version="1.0" encoding="utf-8"?>
<xliff xmlns:sdl="http://sdl.com/FileTypes/SdlXliff/1.0" xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2" sdl:version="1.0">
<file original="Internet_Anti-DrugIntro2015.xml_1457007.xlf" datatype="x-sdlfilterframework2" source-language="en-US" target-language="hu-HU">
<body>
<trans-unit translate="no" id="c3a13bfb-ed51-49cf-8278-e2c86c2114c0">
<source>
<x id="0"/>
</source>
</trans-unit>
<group>
<sdl:cxts>
<sdl:cxt id="1"/>
</sdl:cxts>
<trans-unit id="3b4520df-4483-4c9e-8a9b-ce2544269f3e">
<source>
<x id="1"/>
</source>
<seg-source>
<mrk mtype="seg" mid="2">
<x id="1"/>Drugs are robbing our children of their future.
</mrk>
<mrk mtype="seg" mid="3">
<x id="2"/>Every 17 seconds a teenager experiments with an illicit drug for the first time.
</mrk>
</seg-source>
<target>
<mrk mtype="seg" mid="2">
<x id="1"/>A drogok megfosztják gyermekeinket a jövőjüktől.
</mrk>
<mrk mtype="seg" mid="3">
<x id="2"/>17 másodpercenként egy újabb tizenéves próbálja ki először a kábítószereket.
</mrk>
</target>
</trans-unit>
</group>
<trans-unit translate="no" id="7890462c-edcb-4fe6-9192-033ba76d9942">
<source>
<x id="183"/>
</source>
</trans-unit>
</body>
</file>
</xliff>
I would greatly appreciate any help.
Instead of navigating up the XML tree with Parent a varying number of times depending on the XML structure, you can try using Ancestors().Last() to find the highest-level ancestor named either "trans-unit" or "group", and then navigate to the previous node. Ancestors() enumerates from the nearest ancestor outwards, so Last() with a predicate picks the outermost match: the group when there is one, otherwise the trans-unit itself.
Try replacing this part:
.Select(item => (XElement) item.Parent.Parent.PreviousNode)
with this one:
.Select(item => (XElement)item.Ancestors()
.Last(o => new[]{"trans-unit","group"}.Contains(o.Name.LocalName))
.PreviousNode)
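For reference, here is a minimal, self-contained sketch of how that replacement could fit into a complete query. It is not the asker's exact code. The inline XML is a trimmed-down stand-in for the second sample, the `wanted` array and the null-conditional chain are my own additions, and the result is flattened into a plain string array instead of the nested sequence the original query produces. It also yields null instead of throwing when the previous node has no source or x element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Trimmed-down stand-in for the second sample (trans-unit wrapped in a group).
        const string xml = @"<xliff xmlns='urn:oasis:names:tc:xliff:document:1.2' version='1.2'>
  <file><body>
    <trans-unit translate='no' id='a'><source><x id='0'/></source></trans-unit>
    <group>
      <trans-unit id='b'>
        <source><x id='1'/></source>
        <seg-source><mrk mtype='seg' mid='2'><x id='1'/>Text</mrk></seg-source>
      </trans-unit>
    </group>
  </body></file>
</xliff>";

        // Parse without LoadOptions.PreserveWhitespace, so PreviousNode
        // is the preceding element rather than a whitespace text node.
        XDocument xliff = XDocument.Parse(xml);
        XNamespace xmlns = "urn:oasis:names:tc:xliff:document:1.2";
        var wanted = new[] { "trans-unit", "group" }; // hypothetical helper name

        var parentids = xliff.Descendants(xmlns + "mrk")
            .Where(e => e.Parent?.Name == xmlns + "seg-source"
                     && (string)e.Attribute("mtype") == "seg")
            // Outermost ancestor that is a trans-unit or a group, then its previous sibling.
            .Select(e => e.Ancestors()
                          .LastOrDefault(a => wanted.Contains(a.Name.LocalName))
                          ?.PreviousNode as XElement)
            // Id of the last x inside source, or null if any step is missing.
            .Select(prev => (string)prev?.Element(xmlns + "source")
                                        ?.Elements(xmlns + "x").LastOrDefault()
                                        ?.Attribute("id"))
            .ToArray();

        Console.WriteLine(string.Join(",", parentids));
    }
}
```

The explicit (string) conversion on XAttribute returns null for a missing attribute, which is what lets the query return null for a non-matching node rather than fail on .Value.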