
Parse XML with LINQ to get child elements

<?xml version="1.0" standalone="yes"?>
<CompanyInfo>
     <Employee name="Jon" deptId="123">
      <Region name="West">
        <Area code="96" />
      </Region>
      <Region name="East">
        <Area code="88" />
      </Region>
     </Employee>
</CompanyInfo>  

public class Employee
{
    public string EmployeeName { get; set; }
    public string DeptId { get; set; }
    public List<Region> RegionList { get; set; }
}

public class Region
{
    public string RegionName { get; set; }
    public string AreaCode { get; set; }
}

I am trying to read this XML data. So far I have tried this:

XDocument xml = XDocument.Load(@"C:\data.xml");
var xElement = xml.Element("CompanyInfo");
if (xElement != null)
    foreach (var child in xElement.Elements())
    {
        Console.WriteLine(child.Name);  
        foreach (var item in child.Attributes())
        {
            Console.WriteLine(item.Name + ": " + item.Value);
        }

        foreach (var childElement in child.Elements())
        {
            Console.WriteLine("--->" + childElement.Name);
            foreach (var ds in childElement.Attributes())
            {
                Console.WriteLine(ds.Name + ": " + ds.Value);
            }
            foreach (var element in childElement.Elements())
            {
                Console.WriteLine("------->" + element.Name);
                foreach (var ds in element.Attributes())
                {
                    Console.WriteLine(ds.Name + ": " + ds.Value);
                }
            }
        }                
    }

This lets me get each node with its attribute names and values, so I can save the data into the relevant database fields. But it seems long-winded and inflexible: if the XML structure changes, all those foreach statements need revisiting. It is also difficult to filter the data this way; I have to write specific if statements for each filter (e.g. to get only employees from West).

I was looking for a more flexible way using LINQ, something like this:

List<Employees> employees =
              (from employee in xml.Descendants("CompanyInfo")
               select new employee
               {
                   EmployeeName = employee.Element("employee").Value,
                   EmployeeDeptId = ?? get data,
                   RegionName = ?? get data,
                   AreaCode = ?? get data,,
               }).ToList<Employee>();

But I am not sure how to get the values from the child nodes and apply the filtering (to get only certain employees). Is this possible? Any help is appreciated.

Thanks

var employees = (from e in xml.Root.Elements("Employee")
                 let r = e.Element("Region")
                 where (string)r.Attribute("name") == "West"
                 select new Employee
                 {
                     EmployeeName = (string)e.Attribute("name"),
                     EmployeeDeptId = (string)e.Attribute("deptId"),
                     RegionName = (string)r.Attribute("name"),
                     AreaCode = (string)r.Element("Area").Attribute("code"),
                 }).ToList();

But the query will still need revising when the XML file structure changes.

Edit

Query for multiple regions per employee:

var employees = (from e in xml.Root.Elements("Employee")
                 select new Employee
                 {
                     EmployeeName = (string)e.Attribute("name"),
                     DeptId = (string)e.Attribute("deptId"),
                     RegionList = e.Elements("Region")
                                   .Select(r => new Region {
                                       RegionName = (string)r.Attribute("name"),
                                       AreaCode = (string)r.Element("Area").Attribute("code")
                                   }).ToList()
                 }).ToList();

You can then filter the list for employees from a given region only:

var westEmployees = employees.Where(x => x.RegionList.Any(r => r.RegionName == "West")).ToList();
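For reference, here is a minimal end-to-end sketch of the approach above, run against the sample XML from the question. It assumes `RegionList` is declared as `List<Region>`; the `EmployeeXml` class and its `Parse` method are hypothetical names introduced only for this illustration, and the `?.` guard on `<Area>` is an addition not present in the original query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Region
{
    public string RegionName { get; set; }
    public string AreaCode { get; set; }
}

public class Employee
{
    public string EmployeeName { get; set; }
    public string DeptId { get; set; }
    public List<Region> RegionList { get; set; } // assumption: a list of Region objects
}

public static class EmployeeXml
{
    // Hypothetical helper: maps each <Employee> directly under the root to an Employee object.
    public static List<Employee> Parse(string xmlText)
    {
        XDocument xml = XDocument.Parse(xmlText);
        return xml.Root.Elements("Employee")
            .Select(e => new Employee
            {
                EmployeeName = (string)e.Attribute("name"),
                DeptId = (string)e.Attribute("deptId"),
                RegionList = e.Elements("Region")
                    .Select(r => new Region
                    {
                        RegionName = (string)r.Attribute("name"),
                        // ?. guards a missing <Area>; casting a null XAttribute yields null.
                        AreaCode = (string)r.Element("Area")?.Attribute("code")
                    })
                    .ToList()
            })
            .ToList();
    }

    public static void Main()
    {
        const string sample = @"<CompanyInfo>
  <Employee name=""Jon"" deptId=""123"">
    <Region name=""West""><Area code=""96"" /></Region>
    <Region name=""East""><Area code=""88"" /></Region>
  </Employee>
</CompanyInfo>";

        List<Employee> employees = Parse(sample);

        // Keep only employees that have at least one region named "West".
        var westEmployees = employees
            .Where(x => x.RegionList.Any(r => r.RegionName == "West"))
            .ToList();

        foreach (var emp in westEmployees)
        {
            Console.WriteLine(emp.EmployeeName + " (" + emp.DeptId + "): "
                + string.Join(", ", emp.RegionList.Select(r => r.RegionName + "/" + r.AreaCode)));
        }
        // prints "Jon (123): West/96, East/88"
    }
}
```

Note that the filter keeps the whole employee, including regions other than West; if only the matching regions are wanted, the `RegionList` would need a second `Where` of its own.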

You can track the structure:

from employee in xml
      .Element("CompanyInfo")       // must be root
      .Elements("Employee")         // only direct children of CompanyInfo

or less strictly

from employee in xml.Descendants("Employee")    // all employees at any level

And then get the information you want:

       select new Employee
       {
           EmployeeName = employee.Attribute("name").Value,
           EmployeeDeptId = employee.Attribute("deptId").Value,
           RegionName = employee.Element("Region").Attribute("name").Value,
           AreaCode = employee.Element("Region").Element("Area").Attribute("code").Value,
       }

And with the additional info about multiple regions, assuming a List<Region> Regions property:

       select new Employee
       {
           EmployeeName = employee.Attribute("name").Value,
           EmployeeDeptId = employee.Attribute("deptId").Value,
           //RegionName = employee.Element("Region").Attribute("name").Value,
           //AreaCode = employee.Element("Region").Element("Area").Attribute("code").Value,
           Regions = (from r in employee.Elements("Region") select new Region 
                      {
                         RegionName = r.Attribute("name").Value,
                         AreaCode = r.Element("Area").Attribute("code").Value,
                      }).ToList()
       }

You can do the selection in one query and the filtering in a second, or combine both into one query:

Two queries:

        // do the transformation
        var employees =
          from employee in xml.Descendants("CompanyInfo").Elements("Employee")
          select new
          {
              EmployeeName = employee.Attribute("name").Value,
              EmployeeDeptId = employee.Attribute("deptId").Value,
              Regions = from region in employee.Elements("Region")
                        select new
                            {
                                Name = region.Attribute("name").Value,
                                AreaCode = region.Element("Area").Attribute("code").Value,
                            }
          };

        // now do the filtering
        var filteredEmployees = from employee in employees
                                from region in employee.Regions
                                where region.AreaCode == "96"
                                select employee;

Combined into one query (same output):

          var employees2 =
          from selectedEmployee2 in
              from employee in xml.Descendants("CompanyInfo").Elements("Employee")
              select new
              {
                  EmployeeName = employee.Attribute("name").Value,
                  EmployeeDeptId = employee.Attribute("deptId").Value,
                  Regions = from region in employee.Elements("Region")
                            select new
                                {
                                    Name = region.Attribute("name").Value,
                                    AreaCode = region.Element("Area").Attribute("code").Value,
                                }
              }
          from region in selectedEmployee2.Regions
          where region.AreaCode == "96"
          select selectedEmployee2;

But there is one small thing you should consider adding. For robustness, check that your elements and attributes exist before reading them; the selection then looks like this:

 var employees =
          from employee in xml.Descendants("CompanyInfo").Elements("Employee")
          select new
          {
              EmployeeName = (employee.Attribute("name") != null) ? employee.Attribute("name").Value : string.Empty,
              EmployeeDeptId = (employee.Attribute("deptId") != null) ? employee.Attribute("deptId").Value : string.Empty,
              // Elements() never returns null (it yields an empty sequence), so it needs no null check
              Regions = from region in employee.Elements("Region")
                        select new
                            {
                                Name = (region.Attribute("name")!= null) ? region.Attribute("name").Value : string.Empty,
                                AreaCode = (region.Element("Area") != null && region.Element("Area").Attribute("code") != null) ? region.Element("Area").Attribute("code").Value : string.Empty,
                            }
          };
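As a shorter alternative to the explicit null checks, LINQ to XML defines an explicit conversion from `XAttribute` to `string` that returns null for a null attribute instead of throwing. The sketch below combines that cast with `?.` and `??`; the `NullSafeDemo` class and the `AttrOrEmpty` helper are hypothetical names used only for illustration, and the null-conditional operator assumes C# 6 or later.

```csharp
using System;
using System.Xml.Linq;

public static class NullSafeDemo
{
    // Hypothetical helper: returns the attribute value, or "" when the
    // element or the attribute is absent.
    public static string AttrOrEmpty(XElement element, string attributeName)
    {
        // element?.Attribute(...) is null if element is null or the attribute is missing;
        // (string) on a null XAttribute yields null, and ?? substitutes the default.
        return (string)element?.Attribute(attributeName) ?? string.Empty;
    }

    public static void Main()
    {
        // Sample with no deptId attribute and no <Area> child.
        XElement employee = XElement.Parse(
            @"<Employee name=""Jon""><Region name=""West"" /></Employee>");
        XElement region = employee.Element("Region");

        Console.WriteLine("[" + AttrOrEmpty(employee, "name") + "]");               // attribute present
        Console.WriteLine("[" + AttrOrEmpty(employee, "deptId") + "]");             // attribute missing
        Console.WriteLine("[" + AttrOrEmpty(region.Element("Area"), "code") + "]"); // element missing
    }
}
```

With a helper like this, each property in the query reduces to one call, for example `AreaCode = AttrOrEmpty(region.Element("Area"), "code")`.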
