
Select the nodes whose two subnodes both pass a selection, with LINQ in C#

EDIT: On the advice of @Gert Arnold, I have edited my question and formatted it more thoroughly.

I've been trying to use LINQ to select nodes that satisfy conditions on both id and value. In my case I need the Series elements whose SeriesKey node contains Value nodes with two specific id/value pairs.

Here's my XML string (if you spot any markup mistakes, they are probably due to my indentation; the original file is valid XML):

<message:GenericData xmlns:footer="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message/footer" 
                     xmlns:generic="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic" 
                     xmlns:message="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message" 
                     xmlns:common="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common" 
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<message:DataSet>
  <generic:Series>
      <generic:SeriesKey>
           <generic:Value id="GEO" value="124"/>
           <generic:Value id="PRODUCT" value="4400"/>
           <generic:Value id="FIN" value="03"/>
           <generic:Value id="ENERGY_UNITS" value="WSR"/>
      </generic:SeriesKey>
      <generic:Obs>
          <generic:ObsDimension id="TIME_PERIOD" value="1999"/>
          <generic:ObsValue value="0"/>
          <generic:Attributes>
              <generic:Value id="UNIT_SUFFIX" value="R"/>
          </generic:Attributes>
      </generic:Obs>
      <generic:Obs>
          <generic:ObsDimension id="TIME_PERIOD" value="2000"/>
          <generic:ObsValue value="0"/>
          <generic:Attributes>
              <generic:Value id="UNIT_SUFFIX" value="R"/>
          </generic:Attributes>
      </generic:Obs>
 </generic:Series>
 <generic:Series>
     <generic:SeriesKey>
         <generic:Value id="GEO" value="124"/>
         <generic:Value id="PRODUCT" value="4100"/>
         <generic:Value id="FIN" value="03"/>
         <generic:Value id="ENERGY_UNITS" value="WSR"/>
     </generic:SeriesKey>
     <generic:Obs>
         <generic:ObsDimension id="TIME_PERIOD" value="1999"/>
         <generic:ObsValue value="8246"/>
         <generic:Attributes>
             <generic:Value id="UNIT_SUFFIX" value="R"/>
         </generic:Attributes>
     </generic:Obs>
     <generic:Obs>
         <generic:ObsDimension id="TIME_PERIOD" value="2000"/>
         <generic:ObsValue value="40733"/>
         <generic:Attributes>
             <generic:Value id="UNIT_SUFFIX" value="R"/>
         </generic:Attributes>
     </generic:Obs>
   </generic:Series>
   <generic:Series>
       <generic:SeriesKey>
           <generic:Value id="GEO" value="124"/>
           <generic:Value id="PRODUCT" value="4200"/>
           <generic:Value id="FIN" value="03"/>
           <generic:Value id="ENERGY_UNITS" value="WSR"/>
       </generic:SeriesKey>
       <generic:Obs>
           <generic:ObsDimension id="TIME_PERIOD" value="1999"/>
           <generic:ObsValue value="279"/>
           <generic:Attributes>
               <generic:Value id="UNIT_SUFFIX" value="R"/>
           </generic:Attributes>
       </generic:Obs>
       <generic:Obs>
           <generic:ObsDimension id="TIME_PERIOD" value="2000"/>
           <generic:ObsValue value="324"/>
           <generic:Attributes>
               <generic:Value id="UNIT_SUFFIX" value="R"/>
           </generic:Attributes>
       </generic:Obs>
    </generic:Series>
</message:DataSet>
</message:GenericData>

I tried the query-syntax route, chaining the conditions with logical operators, as you can see in the where clause. The method in question is below. At this point it accepts an XML string (the one above) and two filtering criteria: EnergyProduct, which filters on the PRODUCT value, and EconSector, which filters on the FIN value.

    public IEnumerable<XElement> DataSetFilter(string XmlString, string EnergyProduct, string EconSector)
    {
        XDocument sdmx_response = XDocument.Parse(XmlString);
        XNamespace message = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message";
        XNamespace generic = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic";
        IEnumerable<XElement> DataSet = sdmx_response.Root.Elements(message + "DataSet");
        IEnumerable<XElement> Series = from series in DataSet.Elements(generic + "Series")
                     from serieskey in series.Elements(generic + "SeriesKey")
                     from value in serieskey.Elements(generic + "Value")
                     where 
                     (
                         (string)value.Attribute("id") == "PRODUCT" && (string)value.Attribute("value") == EnergyProduct
                     ) || 
                     (
                         (string)value.Attribute("id") == "FIN" && (string)value.Attribute("value") == EconSector
                     )
                     select serieskey;
        IEnumerable <XElement> observationsSet = from observations in Series.Elements(generic + "Obs").Elements(generic + "ObsValue") select observations;
        return observationsSet;
    }

The problem is that it returns the data matching either criterion, for example everything with PRODUCT code "4400" plus everything with FIN code "03". What I'm looking for is only the nodes that contain subnodes with both of those exact values in the same SeriesKey. I was thinking of creating an anonymous object comprising the XML elements I want, but I got errors and I'm still unsure how to implement that properly. Thank you for all your help!

Try the following:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            XDocument sdmx_response = XDocument.Load(FILENAME);
            // The document declares no default namespace, so look up each prefix on the root instead.
            XNamespace message = sdmx_response.Root.GetNamespaceOfPrefix("message");
            XNamespace generic = sdmx_response.Root.GetNamespaceOfPrefix("generic");

            IEnumerable<XElement> DataSet = sdmx_response.Root.Elements(message + "DataSet");
            // Rebuild each Series, keeping only the SeriesKey values that match PRODUCT or FIN.
            // "Lumber" stands in for the codes to match (e.g. "4400" and "03" in the question).
            IEnumerable<XElement> Series = DataSet.Elements(generic + "Series").Select(series => new XElement("Series", new object[] {
                new XElement("SeriesKey", 
                    series.Elements(generic + "SeriesKey").Elements(generic + "Value").Where(value =>((string)value.Attribute("id") == "PRODUCT" && (string)value.Attribute("value") == "Lumber") || ((string)value.Attribute("id") == "FIN" && (string)value.Attribute("value") == "Lumber"))
                    ),
                series.Elements(generic + "Obs")
            })).ToList();

        }
    }
}
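
For the original requirement (both values in the same SeriesKey), one possible approach is to test each Series with two separate Any() checks joined by &&, rather than flattening the Value elements and using ||. The following is only a sketch under that assumption; the class name SeriesFilter and the helpers HasKey and FilterObservations are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SeriesFilter
{
    static readonly XNamespace Message = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message";
    static readonly XNamespace Generic = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic";

    // True when the SeriesKey holds a Value element with the given id/value pair.
    static bool HasKey(XElement seriesKey, string id, string value) =>
        seriesKey.Elements(Generic + "Value").Any(v =>
            (string)v.Attribute("id") == id && (string)v.Attribute("value") == value);

    // Returns the ObsValue elements of every Series that has a SeriesKey
    // matching BOTH the PRODUCT and the FIN criteria.
    public static IEnumerable<XElement> FilterObservations(string xml, string energyProduct, string econSector)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root
            .Elements(Message + "DataSet")
            .Elements(Generic + "Series")
            .Where(series => series.Elements(Generic + "SeriesKey").Any(key =>
                HasKey(key, "PRODUCT", energyProduct) && HasKey(key, "FIN", econSector)))
            .Elements(Generic + "Obs")
            .Elements(Generic + "ObsValue");
    }
}
```

With the sample XML from the question, SeriesFilter.FilterObservations(xml, "4100", "03") should yield the two ObsValue elements with values 8246 and 40733.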

I have upvoted jdweng's answer and accepted it as the most appropriate solution. This is my code:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            // NormalizeGeneric expects the XML text itself, so read the file first.
            IEnumerable<XElement> NormalizedDataSet = NormalizeGeneric(File.ReadAllText(FILENAME));
            foreach (XElement Series in NormalizedDataSet)
            {
                Console.WriteLine(Series);
            }
        }

        // Static so that it can be called from Main without an instance.
        public static IEnumerable<XElement> NormalizeGeneric(string XmlString)
        {
            XDocument xml_response = XDocument.Parse(XmlString);
            XNamespace message = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message";
            XNamespace generic = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic";
            XElement SeriesSet = xml_response.Root;
            IEnumerable<XElement> SeriesObject = SeriesSet.Elements(message + "DataSet")
                                                          .Elements(generic + "Series")
                                                          .Select(series => new XElement("Series", new object[]
            {
                // One child per SeriesKey value, named after its id (GEO, PRODUCT, FIN, ...).
                new XElement("Metadata", 
                            series.Elements(generic + "SeriesKey")
                                  .Elements(generic + "Value")
                                  .Select(value => new XElement((string)value.Attribute("id"), new XAttribute("value", (string)value.Attribute("value"))))),
                // One Observation per Obs; the dimension becomes an attribute, e.g. TIME_PERIOD="1999".
                new XElement("Data", 
                            series.Elements(generic + "Obs")
                                  .Select(observations => new XElement("Observation",
                                      new XAttribute((string)observations.Element(generic + "ObsDimension").Attribute("id"),
                                                     (string)observations.Element(generic + "ObsDimension").Attribute("value")),
                                      new XAttribute("value", (string)observations.Element(generic + "ObsValue").Attribute("value")),
                                      new XElement("Attributes",
                                          observations.Elements(generic + "Attributes")
                                                      .Elements(generic + "Value")
                                                      .Select(attributes => new XElement((string)attributes.Attribute("id"), new XAttribute("value", (string)attributes.Attribute("value"))))))))
            })).ToArray();
            return SeriesObject;
        }
    }
}

The difference between my code and jdweng's is that I also included the Data portion of the file, which contains the actual numbers. 'Normalizing' the dataset is unavoidable, since it makes it easier to manipulate the values and filter for the necessary nodes. Thank you, and apologies for the tardy response and the indentation.
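
As an illustration of filtering after normalization, the sketch below assumes the shape produced by NormalizeGeneric above (a Series element with a Metadata child holding PRODUCT and FIN elements that carry a value attribute). The class name NormalizedFilter and the method WhereMetadata are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NormalizedFilter
{
    // Keeps only the normalized Series whose Metadata has both the requested
    // PRODUCT value and the requested FIN value.
    public static IEnumerable<XElement> WhereMetadata(
        IEnumerable<XElement> normalizedSeries, string energyProduct, string econSector)
    {
        return normalizedSeries.Where(series =>
        {
            XElement metadata = series.Element("Metadata");
            // The null-conditional operators make a missing element compare as null
            // instead of throwing.
            string product = (string)metadata?.Element("PRODUCT")?.Attribute("value");
            string fin = (string)metadata?.Element("FIN")?.Attribute("value");
            return product == energyProduct && fin == econSector;
        });
    }
}
```

Because each key now has its own element name, the two conditions can be joined with a plain && on a single Series, for example NormalizedFilter.WhereMetadata(NormalizedDataSet, "4400", "03").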
