
Querying an XML tree with LINQ

I'm trying to parse a complex XML file using LINQ. The file contains thousands of records, each with hundreds of fields. I need to extract certain pieces of information about each drug and store them in a database.

Edit: I'm very sorry all, but the originally posted XML was in fact not accurate. I was unaware that the attributes on the root element (in particular the default namespace declaration) would alter the process. I've updated the question to accurately portray the true nature of the XML file.

Here's a sample of the XML:

<drugs xmlns:xs="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://drugbank.ca" xs:schemaLocation="http://www.drugbank.ca/docs/drugbank.xsd" schemaVersion="1.4">
   <drug>
      <name>foo</name>
      <indication>Some info here</indication>
      <half-life>1 to 3 hours</half-life>
      <protein-binding>90%</protein-binding>
      <!-- hundreds of other elements -->
      <properties>
         <property>
            <kind>logP/hydrophobicity</kind>
            <value>-0.777</value>
         </property>
         <property>
            <kind>Molecular Weight</kind>
            <value>6963.4250</value>
         </property>
         <property>
            <kind>Molecular Formula</kind>
            <value>C287H440N80O110S6</value>
         </property>
         <!-- dozens of other properties -->
      </properties>
   </drug>
   <!-- thousands more drugs -->
</drugs>

I'm pretty fuzzy on the actual querying, as this is my first time working with LINQ. I'm familiar with SQL, so the concept of complex queries isn't difficult for me, but I haven't been able to find any documentation I can understand that helps with this issue. The query that I have so far is as follows:

XDocument xdoc = XDocument.Load(@"drugbank.xml");

var d = from drugs in xdoc.Descendants("drug")
                        select new
                        {
                            name = drugs.Element("name").Value,
                            indication = drugs.Element("indication").Value,
                            halflife = drugs.Element("half-life").Value,
                            proteinBinding = drugs.Element("protein-binding").Value,
                        };
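For readers following along: the root element declares xmlns="http://drugbank.ca", so every child element inherits that default namespace, and an unqualified name such as "drug" matches nothing. The following is only a minimal sketch of that difference. The class name, the helper CountDrugs, and the trimmed-down inline XML are illustrative assumptions, not part of the real file or the original code.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceDemo
{
    // Trimmed-down copy of the sample document, for illustration only.
    public const string SampleXml =
        @"<drugs xmlns=""http://drugbank.ca"">
            <drug><name>foo</name></drug>
          </drugs>";

    // Counts <drug> elements found with an unqualified name
    // and with a namespace-qualified name.
    public static (int Unqualified, int Qualified) CountDrugs(string xml)
    {
        XDocument xdoc = XDocument.Parse(xml);
        XNamespace ns = "http://drugbank.ca";

        // "drug" alone means: local name "drug" in no namespace.
        int unqualified = xdoc.Descendants("drug").Count();

        // ns + "drug" means: local name "drug" in the default namespace.
        int qualified = xdoc.Descendants(ns + "drug").Count();

        return (unqualified, qualified);
    }

    public static void Main()
    {
        var counts = CountDrugs(SampleXml);
        // Prints "unqualified: 0, qualified: 1"
        Console.WriteLine($"unqualified: {counts.Unqualified}, qualified: {counts.Qualified}");
    }
}
```

The same qualification applies to every Element(...) call inside the query, not just to Descendants(...).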

The first issue is (theoretically) resolved. On to the second.

The second issue is that I need to extract some of the properties (namely hydrophobicity, molecular weight, and molecular formula), but the property kind and the property value are stored in two different XElements, which is where I'm confused. How can I restrict the property values to the kinds that I care about?

I pasted your code and got the following output:

foo
Some info here
1 to 3 hours
90%

Just as expected.

You can use a subquery to put the properties into another property of the outer anonymous object. Note that every element name has to be qualified with the document's default namespace. If you want them nested:

XNamespace defaultNS = "http://drugbank.ca";

var d = from drugs in xdoc.Descendants(defaultNS + "drug")
        select new
        {
            name = drugs.Element(defaultNS + "name").Value,
            indication = drugs.Element(defaultNS + "indication").Value,
            halflife = drugs.Element(defaultNS + "half-life").Value,
            proteinBinding = drugs.Element(defaultNS + "protein-binding").Value,
            Properties = (from property in drugs.Element(defaultNS + "properties").Elements(defaultNS + "property")
                          let kind = property.Element(defaultNS + "kind").Value
                          where kind == "logP/hydrophobicity" || kind == "Molecular Weight" || kind == "Molecular Formula"
                          select new { Kind = kind, Value = property.Element(defaultNS + "value").Value })
        };

Or flattened:

XNamespace defaultNS = "http://drugbank.ca";

var d = from drugs in xdoc.Descendants(defaultNS + "drug")
        let properties = drugs.Element(defaultNS + "properties").Elements(defaultNS + "property")
        select new
        {
            name = drugs.Element(defaultNS + "name").Value,
            indication = drugs.Element(defaultNS + "indication").Value,
            halflife = drugs.Element(defaultNS + "half-life").Value,
            proteinBinding = drugs.Element(defaultNS + "protein-binding").Value,
            // Each subquery picks the value of the first property whose kind matches,
            // or null if the drug has no such property.
            hydrophobicity = (from property in properties
                          let kind = property.Element(defaultNS + "kind").Value
                          where kind == "logP/hydrophobicity"
                          select property.Element(defaultNS + "value").Value).FirstOrDefault(),
            molecularWeight = (from property in properties
                          let kind = property.Element(defaultNS + "kind").Value
                          where kind == "Molecular Weight"
                          select property.Element(defaultNS + "value").Value).FirstOrDefault(),
            molecularFormula = (from property in properties
                          let kind = property.Element(defaultNS + "kind").Value
                          where kind == "Molecular Formula"
                          select property.Element(defaultNS + "value").Value).FirstOrDefault()
        };
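The flattened query rescans the property list once per field, and .Value throws if an element is missing. One possible alternative, sketched here under the assumption that each kind appears at most once per drug, is to build a kind-to-value lookup once per drug and use the explicit (string) conversion on XElement, which yields null for a missing element instead of throwing. The names DrugParser, Drug, ReadDrugs, and Get, as well as the inline XML, are hypothetical and only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DrugParser
{
    private static readonly XNamespace Ns = "http://drugbank.ca";

    // Shortened copy of the sample document, for illustration only.
    public const string SampleXml = @"<drugs xmlns=""http://drugbank.ca"">
  <drug>
    <name>foo</name>
    <half-life>1 to 3 hours</half-life>
    <properties>
      <property><kind>logP/hydrophobicity</kind><value>-0.777</value></property>
      <property><kind>Molecular Weight</kind><value>6963.4250</value></property>
      <property><kind>Molecular Formula</kind><value>C287H440N80O110S6</value></property>
    </properties>
  </drug>
</drugs>";

    public sealed class Drug
    {
        public string Name { get; set; }
        public string HalfLife { get; set; }
        public string Hydrophobicity { get; set; }
        public string MolecularWeight { get; set; }
        public string MolecularFormula { get; set; }
    }

    // Returns the value stored under key, or null when the key is absent.
    private static string Get(Dictionary<string, string> map, string key)
    {
        return map.TryGetValue(key, out var value) ? value : null;
    }

    public static List<Drug> ReadDrugs(string xml)
    {
        XDocument xdoc = XDocument.Parse(xml);
        var result = new List<Drug>();

        foreach (XElement drug in xdoc.Descendants(Ns + "drug"))
        {
            // Build the kind -> value lookup once; the first occurrence wins.
            var props = new Dictionary<string, string>();
            foreach (XElement p in drug.Elements(Ns + "properties").Elements(Ns + "property"))
            {
                string kind = (string)p.Element(Ns + "kind");
                if (kind != null && !props.ContainsKey(kind))
                    props[kind] = (string)p.Element(Ns + "value");
            }

            result.Add(new Drug
            {
                // (string) on a null XElement gives null rather than an exception.
                Name = (string)drug.Element(Ns + "name"),
                HalfLife = (string)drug.Element(Ns + "half-life"),
                Hydrophobicity = Get(props, "logP/hydrophobicity"),
                MolecularWeight = Get(props, "Molecular Weight"),
                MolecularFormula = Get(props, "Molecular Formula")
            });
        }
        return result;
    }

    public static void Main()
    {
        foreach (Drug d in ReadDrugs(SampleXml))
            Console.WriteLine($"{d.Name}: weight={d.MolecularWeight}, formula={d.MolecularFormula}");
    }
}
```

For a file with thousands of drugs, a named class like this may also be easier to hand to a database layer than an anonymous type, though that is a design preference rather than a requirement.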

Also, a very useful reference for learning LINQ is 101 LINQ Samples.
