How to read this XML using LINQ to XML

I am new to LINQ to XML, and currently working with the following XML:

<invoices>
  <invoice>
    <order_id>85</order_id>
    <time>02:52 PM</time>
    <date>24-05-2013</date>
    <order>
      <item>
        <Main>
          <id>343</id>
          <Qty>1</Qty>
        </Main>
        <Add />
      </item>
      <item>
        <Main>
          <id>3</id>
          <Qty>1</Qty>
        </Main>
        <Add>
          <Extra id="1">
            <Qty>1</Qty>
            <Desc>Regular</Desc>
          </Extra>
        </Add>
      </item>
    </order>
  </invoice>
  <invoice>
    <order_id>88</order_id>
    <time>03:10 PM</time>
    <date>24-05-2013</date>
    <order>
      <item>
        <Main>
          <id>345</id>
          <Qty>1</Qty>
        </Main>
        <Add />
      </item>
      <item>
        <Main>
          <id>2</id>
          <Qty>2</Qty>
        </Main>
        <Add>
          <Extra id="1">
            <Qty>1</Qty>
            <Desc>Regular</Desc>
          </Extra>
        </Add>
      </item>
    </order>
  </invoice>
</invoices>

So far I have written the following code:

void queryData(XDocument doc)
{
        var data = from item in doc.Descendants("invoice")
                   select new
                   {
                       orderId = item.Element("order_id").Value,
                       orderDate = item.Element("date").Value,
                       orderTime = item.Element("time").Value
                   };
        foreach(var p in data)
            Console.WriteLine(p.ToString());

        //...

}

I am having trouble reading the nested tags inside the "order" tag. Also, the "Add" element sometimes contains one or more "Extra" elements and is sometimes empty.

I don't have access to the code that generates this XML, so I have to read it in this shape.

So far I have tried working with grouping, but I am not able to work with 2nd and 3rd level elements.

After reading I would save these values to the database.

Thanks,

For the nested elements, just keep going with .Element("name") :

orderQuantities = item.Element("order").Elements("item")
    .Select(orderItem => new { 
        id = orderItem.Element("Main").Element("id").Value,
        qty = orderItem.Element("Main").Element("Qty").Value
     }).ToArray(),

For the elements that you are not sure exist, you can always write a helper method:

extraQty = GetExtra(orderItem),  // pass the <item> element, since <Add> is a child of <item>, not of <invoice>

Where GetExtra would be something like:

public int GetExtra(XElement element)
{
    XElement extra = element.Element("Add").Element("Extra");
    if (extra != null) return int.Parse(extra.Element("Qty").Value);
    else return 0;
}

(Needs more error handling of course, but you get the idea.)
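One way to add that error handling, as a minimal sketch: LINQ to XML defines explicit conversions from XElement to nullable types, so casting a missing element to int? yields null instead of throwing, and Elements(...) on a sequence simply returns an empty sequence when nothing matches. The names ExtraReader and GetExtraQty and the inline sample elements are only for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class ExtraReader
{
    // Hypothetical helper: quantity of the first <Extra> under <Add>, or 0 when there is none.
    public static int GetExtraQty(XElement orderItem)
    {
        // Elements(...) never returns null, so this chain is safe for <Add />
        // and even for an <item> that has no <Add> child at all.
        XElement qty = orderItem.Elements("Add").Elements("Extra").Elements("Qty").FirstOrDefault();

        // The explicit XElement -> int? conversion returns null for a null element.
        return (int?)qty ?? 0;
    }

    static void Main()
    {
        XElement withExtra = XElement.Parse(
            "<item><Main><id>3</id><Qty>1</Qty></Main>" +
            "<Add><Extra id=\"1\"><Qty>1</Qty><Desc>Regular</Desc></Extra></Add></item>");
        XElement withoutExtra = XElement.Parse(
            "<item><Main><id>343</id><Qty>1</Qty></Main><Add /></item>");

        Console.WriteLine(GetExtraQty(withExtra));    // prints 1
        Console.WriteLine(GetExtraQty(withoutExtra)); // prints 0
    }
}
```

This still throws a FormatException if Qty holds non-numeric text; int.TryParse would be the next step if the input cannot be trusted.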

Let me know if I am off on something. I didn't get a chance to test this, and I also had to assume that some of the elements were going to be repeated.

var data = from item in doc.Descendants ( "invoice" )
    select new {
        orderId = item.Element ( "order_id" ).Value ,
        orderDate = item.Element ( "date" ).Value ,
        orderTime = item.Element ( "time" ).Value ,
        items = 
            from order in item.Element ( "order" ).Descendants ( "item" )
            let main = order.Element ( "Main" )
            let adds = order.Elements ( "Add" )
            select new {
                Main = new {
                    id = main.Element ( "id" ).Value ,
                    Qty = main.Element ( "Qty" ).Value
                } ,
                Add = 
                (from add in adds
                    let extras = add.Elements ( "Extra" )
                    select new {
                                Extra = ( from extra in extras
                                        select new {
                                                extraId = extra.Attribute("id").Value,
                                                Qty = extra.Element ( "Qty" ).Value ,
                                                Desc = extra.Element ( "Desc" ).Value
                                            }).FirstOrDefault ( )
                            }).FirstOrDefault()
            }
};

Here is how your XML can be parsed:

var parser = new Parser();
XDocument xdoc = XDocument.Load(path_to_xml);
var orders = from invoice in xdoc.Root.Elements()
             select parser.ParseOrderFrom(invoice);

That's all. I have created the following classes. Order, which holds a collection of order items and has a properly parsed date:

public class Order
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public List<OrderItem> Items { get; set; }
}

OrderItem, which is your main dish. It also has a list of extras inside (if any):

public class OrderItem
{
    public int Id { get; set; }
    public int Quantity { get; set; }
    public List<Extra> Extras { get; set; }
}

And extras class:

public class Extra
{
    public int Id { get; set; }
    public int Quantity { get; set; }
    public string Description { get; set; }
}

All parsing happens in a separate parser class, which keeps the domain classes clean:

public class Parser
{
    public Order ParseOrderFrom(XElement invoice)
    {
        string time = (string)invoice.Element("time");
        string date = (string)invoice.Element("date");

        return new Order {
           Id = (int)invoice.Element("order_id"),
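           // Pass an explicit culture so "PM" parses regardless of the machine's regional settings.
           Date = DateTime.ParseExact(date + time, "dd-MM-yyyyhh:mm tt",
                                      System.Globalization.CultureInfo.InvariantCulture),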
           Items = invoice.Element("order").Elements("item")
                          .Select(i => ParseOrderItemFrom(i)).ToList()
        };
    }

    public OrderItem ParseOrderItemFrom(XElement item)
    {
        var main = item.Element("Main");

        return new OrderItem {
            Id = (int)main.Element("id"),
            Quantity = (int)main.Element("Qty"),
            Extras = item.Element("Add").Elements("Extra")
                         .Select(e => ParseExtraFrom(e)).ToList()
        };
    }

    public Extra ParseExtraFrom(XElement extra)
    {
        return new Extra {
            Id = (int)extra.Attribute("id"),
            Quantity = (int)extra.Element("Qty"),
            Description = (string)extra.Element("Desc")
        };
    }
}
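The date handling in ParseOrderFrom can be tried in isolation. The sketch below is an illustration only: it assumes the invariant culture, so that the "PM" designator parses the same way on every machine, and it puts a space between the two parts for readability. InvoiceDate and Combine are hypothetical names.

```csharp
using System;
using System.Globalization;

static class InvoiceDate
{
    // Hypothetical helper: combines the <date> and <time> strings of one invoice.
    public static DateTime Combine(string date, string time)
    {
        // dd-MM-yyyy matches "24-05-2013"; hh:mm tt matches the 12-hour "02:52 PM".
        return DateTime.ParseExact(date + " " + time, "dd-MM-yyyy hh:mm tt",
                                   CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        DateTime when = Combine("24-05-2013", "02:52 PM");
        Console.WriteLine(when.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture));
        // prints 2013-05-24 14:52
    }
}
```

ParseExact throws a FormatException on input that does not match the pattern; DateTime.TryParseExact is the non-throwing alternative.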

Tested and working. I found no clean way to do this in one shot without defining some extra classes. Here I have a pivot interface, Item, and two classes that implement it, AddItem and MainItem.

Feel free to ask for an explanation of any portion.

// Since there are different types of items, we need an interface/abstract
// class to pivot on.
public interface Item {
}

// The information necessary for storing the 'Extra' element.
public class Extra {
    public Int32 ID { get; private set; }
    public Int32 Quantity { get; private set; }
    public String Description { get; private set; }

    public Extra(XElement extra) {

        // Here we load up all of the details from the 'extra' element
        this.ID = Int32.Parse(extra.Attribute("id").Value);
        this.Quantity = Int32.Parse(extra.Element("Qty").Value);
        this.Description = extra.Element("Desc").Value;
    }
}

// The 'add-item' is associated with the 'add' tag in the actual XML.
public class AddItem : Item {

    public IEnumerable<Extra> Extras { get; private set; }

    // The 'extras' is a collection of many items, so we require
    // an IEnumerable.
    public AddItem(IEnumerable<Extra> extras) {
        this.Extras = extras;
    }

}

// The storage for the 'main-item'
public class MainItem : Item {
    public Int32 ID { get; private set; }
    public Int32 Quantity { get; private set; }

    public MainItem(Int32 id, Int32 quantity) {
        this.ID = id;
        this.Quantity = quantity;
    }
}

class Program {
    static void Main(string[] args) {
        String data = File.ReadAllText("File.txt");

        XElement tree = XElement.Parse(data);


        var projection = tree.Elements()
            .Select(invoice => new {
                // Project the main details of the invoice { OrderID, Time, Date, Order }
                // The order itself needs to be projected again though because it too is a 
                // collection of sub items.
                OrderID = invoice.Element("order_id").Value,
                Time = invoice.Element("time").Value,
                Date = invoice.Element("date").Value,
                Order = invoice.Element("order")
                    .Elements()
                    .Elements()
                    .Select(item => {

                        // First, we need to know what type of item this 'order' is.
                        String itemType = item.Name.ToString();

                        // If it's a 'Main' item, then return that type.
                        if (itemType == "Main") {
                            Int32 id = Int32.Parse(item.Element("id").Value);
                            Int32 quantity = Int32.Parse(item.Element("Qty").Value);

                            return (Item)new MainItem(id, quantity);
                        }

                        // If it's an 'Add' item. Then we have to:
                        if (itemType == "Add") {
                            // (1) Capture all of the extras.
                            IEnumerable<Extra> extras = item.Elements()
                                .Select(extra => new Extra(extra))
                                .ToList();

                            // (2) Add the extras to a new AddItem. Then return the 'add'-item.
                            // Notice that we have to cast to 'Item' because we are returning 
                            // a 'Main'-item sometimes and an 'add' item other times.
                            // Select requires the return type to be the same regardless.
                            return (Item)new AddItem(extras);
                        }

                        // Hopefully this path never hits.
                        throw new NotImplementedException("This path not defined");

                    }).ToList()

            }).ToList();

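        // Note: this prints the list's type name, not its contents;
        // enumerate 'projection' (or set a breakpoint here) to inspect the data.
        Console.WriteLine(projection);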
    }
}

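You can make things more manageable if you use some XPath in your query. Pure LINQ to XML can get too verbose here, if you ask me. The XPathSelectElements and XPathSelectElement extension methods live in the System.Xml.XPath namespace, so add a using directive for it.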

var query =
    from invoice in doc.XPathSelectElements("/invoices/invoice")
    select new
    {
        OrderId = (int)invoice.Element("order_id"),
        Time = (string)invoice.Element("time"),
        Date = (string)invoice.Element("date"),
        Items =
            from item in invoice.XPathSelectElements("./order/item")
            select new
            {
                Id = (int)item.XPathSelectElement("./Main/id"),
                Quantity = (int)item.XPathSelectElement("./Main/Qty"),
                Extras =
                    from extra in item.XPathSelectElements("./Add/Extra")
                    select new
                    {
                        Id = (int)extra.Attribute("id"),
                        Quantity = (int)extra.Element("Qty"),
                        Description = (string)extra.Element("Desc"),
                    },
            },
    };
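As a cut-down, self-contained sketch of the same idea, the program below runs the XPath calls against a trimmed copy of the sample XML. The XPathDemo class, the Describe method, and the output format are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings XPathSelectElements / XPathSelectElement into scope

static class XPathDemo
{
    // Hypothetical helper: one summary line per <item> in the document.
    public static List<string> Describe(XDocument doc)
    {
        var lines = new List<string>();
        foreach (XElement item in doc.XPathSelectElements("/invoices/invoice/order/item"))
        {
            int id = (int)item.XPathSelectElement("./Main/id");
            // The sequence is simply empty for <Add />, so no null check is needed.
            int extras = item.XPathSelectElements("./Add/Extra").Count();
            lines.Add("item " + id + ": " + extras + " extra(s)");
        }
        return lines;
    }

    static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<invoices><invoice><order_id>85</order_id><order>" +
            "<item><Main><id>343</id><Qty>1</Qty></Main><Add /></item>" +
            "<item><Main><id>3</id><Qty>1</Qty></Main>" +
            "<Add><Extra id=\"1\"><Qty>1</Qty><Desc>Regular</Desc></Extra></Add></item>" +
            "</order></invoice></invoices>");

        foreach (string line in Describe(doc))
            Console.WriteLine(line);
        // prints:
        // item 343: 0 extra(s)
        // item 3: 1 extra(s)
    }
}
```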
