How do I extract info deep inside XML using C# and LINQ?
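This is my first post on StackOverflow, so please bear with me. I apologize in advance if my code example is a bit long.
Using C# and LINQ, I'm trying to identify a series of third-level id elements (000049 in this case) in a much larger XML file. Each third-level id is unique, and which ones I want depends on a series of descendant information under each. More specifically, if type == A and location type(old) == vault and location type(new) == out, then I want to select that id. Below are the XML and the C# code I'm using.
In general, my code works. As written below, it returns the id 000049 twice, which is correct. However, I've found a glitch. If I remove the first history block that contains type == A, my code still returns the id 000049 twice, when it should return it only once. I know why it's happening, but I can't figure out a better way to run the query. Is there a better way to run my query that gets the output I want and still uses LINQ?
My XML: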
<?xml version="1.0" encoding="ISO8859-1" ?>
<data type="historylist">
<date type="runtime">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
<hour>15</hour>
<minutes>24</minutes>
<seconds>46</seconds>
</date>
<customer>
<id>0001</id>
<description>customer</description>
<mediatype>
<id>kit</id>
<description>customer kit</description>
<volume>
<id>000049</id>
<history>
<date type="optime">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
<hour>03</hour>
<minutes>00</minutes>
<seconds>02</seconds>
</date>
<userid>batch</userid>
<type>OD</type>
<location type="old">
<repository>vault</repository>
<slot>0</slot>
</location>
<location type="new">
<repository>out</repository>
<slot>0</slot>
</location>
<container>0001.kit.000049</container>
<date type="movedate">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
</date>
</history>
<history>
<date type="optime">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
<hour>06</hour>
<minutes>43</minutes>
<seconds>33</seconds>
</date>
<userid>vaultred</userid>
<type>A</type>
<location type="old">
<repository>vault</repository>
<slot>0</slot>
</location>
<location type="new">
<repository>out</repository>
<slot>0</slot>
</location>
<container>0001.kit.000049</container>
<date type="movedate">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
</date>
</history>
<history>
<date type="optime">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
<hour>06</hour>
<minutes>43</minutes>
<seconds>33</seconds>
</date>
<userid>vaultred</userid>
<type>S</type>
<location type="old">
<repository>vault</repository>
<slot>0</slot>
</location>
<location type="new">
<repository>out</repository>
<slot>0</slot>
</location>
<container>0001.kit.000049</container>
<date type="movedate">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
</date>
</history>
<history>
<date type="optime">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
<hour>06</hour>
<minutes>45</minutes>
<seconds>00</seconds>
</date>
<userid>batch</userid>
<type>O</type>
<location type="old">
<repository>out</repository>
<slot>0</slot>
</location>
<location type="new">
<repository>site</repository>
<slot>0</slot>
</location>
<container>0001.kit.000049</container>
<date type="movedate">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
</date>
</history>
<history>
<date type="optime">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
<hour>11</hour>
<minutes>25</minutes>
<seconds>59</seconds>
</date>
<userid>ihcmdm</userid>
<type>A</type>
<location type="old">
<repository>out</repository>
<slot>0</slot>
</location>
<location type="new">
<repository>site</repository>
<slot>0</slot>
</location>
<container>0001.kit.000049</container>
<date type="movedate">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
</date>
</history>
<history>
<date type="optime">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
<hour>11</hour>
<minutes>25</minutes>
<seconds>59</seconds>
</date>
<userid>ihcmdm</userid>
<type>S</type>
<location type="old">
<repository>out</repository>
<slot>0</slot>
</location>
<location type="new">
<repository>site</repository>
<slot>0</slot>
</location>
<container>0001.kit.000049</container>
<date type="movedate">
<year>2011</year>
<month>04</month>
<day>22</day>
<dayname>Friday</dayname>
</date>
</history>
</volume>
...
My C# code:
IEnumerable<XElement> caseIdLeavingVault =
    from volume in root.Descendants("volume")
    where
        (from type in volume.Descendants("type")
         where type.Value == "A"
         select type).Any() &&
        (from locationOld in volume.Descendants("location")
         where
             ((String)locationOld.Attribute("type") == "old" &&
              (String)locationOld.Element("repository") == "vault") &&
             (from locationNew in volume.Descendants("location")
              where
                  ((String)locationNew.Attribute("type") == "new" &&
                   (String)locationNew.Element("repository") == "out")
              select locationNew).Any()
         select locationOld).Any()
    select volume.Element("id");
...
foreach (XElement volume in caseIdLeavingVault)
{
    Console.WriteLine(volume.Value.ToString());
}
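Thanks.
OK guys, I'm stumped again. Given the same situation and @Elian's solution (which works great), I need the "optime" and "movedate" dates from the history that was used to select the id. Does that make sense? I'm hoping to end up with something like this: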
select new {
    id = volume.Element("id").Value,
    // this is from "optime"
    opYear = <whatever>("year").Value,
    opMonth = <whatever>("month").Value,
    opDay = <whatever>("day").Value,
    // this is from "movedate"
    mvYear = <whatever>("year").Value,
    mvMonth = <whatever>("month").Value,
    mvDay = <whatever>("day").Value
}
I've tried many different combinations, but the Attributes for <date type="optime"> and <date type="movedate"> keep getting in my way, and I can't seem to get what I want.
OK. I found a solution that works well:
select new {
    caseId = volume.Element("id").Value,
    // this is from "optime"
    opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value,
    opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value,
    opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value,
    // this is from "movedate"
    mvYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value,
    mvMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value,
    mvDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value
};
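However, it fails when it finds an id that has no "movedate", because First() throws when nothing matches. A few of those exist, so that's what I'm working on now.
Well, late yesterday afternoon I finally came up with the solution I'd been after: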
var caseIdLeavingSite =
    from volume in root.Descendants("volume")
    where volume.Elements("history").Any(
        h => h.Element("type").Value == "A" &&
             h.Elements("location").Any(l => l.Attribute("type").Value == "old" && ((l.Element("repository").Value == "site") ||
                                                                                   (l.Element("repository").Value == "init"))) &&
             h.Elements("location").Any(l => l.Attribute("type").Value == "new" && l.Element("repository").Value == "toVault")
    )
    select new {
        caseId = volume.Element("id").Value,
        opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value,
        opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value,
        opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value,
        mvYear = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ?
            (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value) : "0",
        mvMonth = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ?
            (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value) : "0",
        mvDay = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ?
            (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value) : "0"
    };
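This meets the requirement that @Elian helped with and picks up the additional date information I need. It also handles the few cases where there is no "movedate" element, by using the ternary operator ?: to fall back to "0".
Now, if anyone knows how to make this more efficient, I'm still interested. Thanks.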
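One possible way to cut down the repeated Descendants("date") scans is sketched below. It is only a sketch under some assumptions: it uses the vault/out filter from the original question rather than site/init/toVault, it reads the dates from the same history that matched (so it yields one row per matching history, not one per volume), and the names CaseDates, DateOf, Part, HasLocation and the trimmed SampleXml are made up for illustration. The explicit (string) cast on XElement and XAttribute returns null instead of throwing when the node is missing, which is what replaces the ternary checks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CaseDates
{
    // Hypothetical trimmed-down sample: one matching history with no "movedate".
    public const string SampleXml = @"
<data>
  <volume>
    <id>000049</id>
    <history>
      <date type=""optime""><year>2011</year><month>04</month><day>22</day></date>
      <type>A</type>
      <location type=""old""><repository>vault</repository></location>
      <location type=""new""><repository>out</repository></location>
    </history>
  </volume>
</data>";

    // Finds the <date> child of a history with the given type attribute, or null.
    static XElement DateOf(XElement history, string type) =>
        history.Elements("date").FirstOrDefault(d => (string)d.Attribute("type") == type);

    // Reads one part of a date, falling back to "0" when the date or the part is missing.
    static string Part(XElement date, string name) =>
        (string)date?.Element(name) ?? "0";

    // True if the history has a <location> with the given type attribute and repository.
    static bool HasLocation(XElement history, string type, string repository) =>
        history.Elements("location").Any(l =>
            (string)l.Attribute("type") == type &&
            (string)l.Element("repository") == repository);

    public static IEnumerable<(string CaseId, string OpYear, string OpMonth, string OpDay,
                               string MvYear, string MvMonth, string MvDay)> Query(XElement root) =>
        from volume in root.Descendants("volume")
        from h in volume.Elements("history")
        where (string)h.Element("type") == "A" &&
              HasLocation(h, "old", "vault") &&
              HasLocation(h, "new", "out")
        let op = DateOf(h, "optime")     // each date is looked up once per history
        let mv = DateOf(h, "movedate")   // null when the history has no movedate
        select (
            CaseId: (string)volume.Element("id"),
            OpYear: Part(op, "year"), OpMonth: Part(op, "month"), OpDay: Part(op, "day"),
            MvYear: Part(mv, "year"), MvMonth: Part(mv, "month"), MvDay: Part(mv, "day"));

    public static void Main()
    {
        foreach (var row in Query(XElement.Parse(SampleXml)))
        {
            // Prints: 000049 op=2011-04-22 mv=0-0-0
            Console.WriteLine($"{row.CaseId} op={row.OpYear}-{row.OpMonth}-{row.OpDay} mv={row.MvYear}-{row.MvMonth}-{row.MvDay}");
        }
    }
}
```

If one row per volume is what's wanted, the second from clause could be swapped for a let that takes the first matching history with FirstOrDefault, followed by a null check.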
I think you want something like this:
IEnumerable<XElement> caseIdLeavingVault =
    from volume in document.Descendants("volume")
    where volume.Elements("history").Any(
        h => h.Element("type").Value == "A" &&
             h.Elements("location").Any(l => l.Attribute("type").Value == "old" && l.Element("repository").Value == "vault") &&
             h.Elements("location").Any(l => l.Attribute("type").Value == "new" && l.Element("repository").Value == "out")
    )
    select volume.Element("id");
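Your code checks independently whether a volume has a <history> element of type A and a (not necessarily the same) <history> element that has the required <location> elements.
The code above checks whether there is a single <history> element that is of type A and also contains the required <location> elements.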
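To make the difference concrete, here is a small self-contained sketch. The trimmed-down XML and the names HistoryMatchDemo, CountIndependent, CountPerHistory and HasLocation are made up for illustration. The sample volume has a type S history that moves vault to out and a separate type A history that moves out to site, so no single history satisfies all three conditions.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class HistoryMatchDemo
{
    // Hypothetical sample: the type-A history is NOT the one that moves vault -> out.
    public const string SampleXml = @"
<data>
  <volume>
    <id>000049</id>
    <history>
      <type>S</type>
      <location type=""old""><repository>vault</repository></location>
      <location type=""new""><repository>out</repository></location>
    </history>
    <history>
      <type>A</type>
      <location type=""old""><repository>out</repository></location>
      <location type=""new""><repository>site</repository></location>
    </history>
  </volume>
</data>";

    // True if the scope contains a <location> descendant with the given type and repository.
    static bool HasLocation(XElement scope, string type, string repository) =>
        scope.Descendants("location").Any(l =>
            (string)l.Attribute("type") == type &&
            (string)l.Element("repository") == repository);

    // Independent checks over the whole volume: each condition may be
    // satisfied by a different <history>, as in the question's query.
    public static int CountIndependent(XElement root) =>
        root.Descendants("volume").Count(v =>
            v.Descendants("type").Any(t => t.Value == "A") &&
            HasLocation(v, "old", "vault") &&
            HasLocation(v, "new", "out"));

    // Per-history check: all three conditions must hold for the same <history>.
    public static int CountPerHistory(XElement root) =>
        root.Descendants("volume").Count(v =>
            v.Elements("history").Any(h =>
                (string)h.Element("type") == "A" &&
                HasLocation(h, "old", "vault") &&
                HasLocation(h, "new", "out")));

    public static void Main()
    {
        XElement root = XElement.Parse(SampleXml);
        Console.WriteLine(CountIndependent(root)); // prints 1 (false positive)
        Console.WriteLine(CountPerHistory(root));  // prints 0
    }
}
```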
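Update: Abatishchev suggested a solution that uses an XPath query instead of LINQ to XML, but his query is too simple and doesn't return exactly what you asked for. The following XPath query does the trick, but it's also a bit long: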
data/customer/mediatype/volume[history[type = 'A' and location[@type = 'old' and repository = 'vault'] and location[@type = 'new' and repository = 'out']]]/id
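If you'd rather stay with XDocument, the same expression can be run through the XPathSelectElements extension method from System.Xml.XPath. The sketch below assumes a made-up, trimmed-down document with two volumes (the ids, the class name XPathOnXDocument and the helper MatchingIds are illustrative only); only the first volume has a type A history that moves vault to out.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings XPathSelectElements into scope for XNode

public static class XPathOnXDocument
{
    public const string Query =
        "data/customer/mediatype/volume[history[type = 'A' " +
        "and location[@type = 'old' and repository = 'vault'] " +
        "and location[@type = 'new' and repository = 'out']]]/id";

    // Hypothetical sample: 000049 matches, 000050 is type A but moves out -> site.
    public const string SampleXml = @"
<data>
  <customer>
    <mediatype>
      <volume>
        <id>000049</id>
        <history>
          <type>A</type>
          <location type=""old""><repository>vault</repository></location>
          <location type=""new""><repository>out</repository></location>
        </history>
      </volume>
      <volume>
        <id>000050</id>
        <history>
          <type>A</type>
          <location type=""old""><repository>out</repository></location>
          <location type=""new""><repository>site</repository></location>
        </history>
      </volume>
    </mediatype>
  </customer>
</data>";

    // Evaluates the XPath relative to the document node and returns the id texts.
    public static string[] MatchingIds(XDocument doc) =>
        doc.XPathSelectElements(Query).Select(e => e.Value).ToArray();

    public static void Main()
    {
        foreach (string id in MatchingIds(XDocument.Parse(SampleXml)))
        {
            Console.WriteLine(id); // prints 000049
        }
    }
}
```

Note that the path is relative, so it has to be evaluated from the XDocument itself. Starting from the root XElement would look for a data child underneath data and find nothing.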
Why use such a complex and expensive LINQ to XML query when you can use a simple XPath query:
using System;
using System.Xml;
string xml = @"...";
string xpath = "data/customer/mediatype/volume/history/type[text()='A']/../location[@type='old' or @type='new']/../../id";
var doc = new XmlDocument();
doc.LoadXml(xml); // or use Load(path);
var nodes = doc.SelectNodes(xpath);
foreach (XmlNode node in nodes)
{
    Console.WriteLine(node.InnerText); // 000049
}
Or, if you don't need the XML DOM model:
using System;
using System.IO; // StringReader
using System.Xml.XPath;
XPathDocument doc = null;
using (var stream = new StringReader(xml))
{
    doc = new XPathDocument(stream); // or pass just the file path if you have one
}
var nav = doc.CreateNavigator();
XPathNodeIterator nodes = (XPathNodeIterator)nav.Evaluate(xpath);
foreach (XPathNavigator node in nodes)
{
    Console.WriteLine(node.Value);
}