
C# LINQ to XML: create output

I'm converting OmniPage XML to ALTO XML, and I decided to use C# for it.

Here is my sample XML file:

<wd l="821" t="283" r="1363" b="394">
<ch l="821" t="312" r="878" b="394" conf="158">n</ch>
<ch l="888" t="312" r="950" b="394" conf="158">o</ch>
<ch l="955" t="283" r="979" b="394" conf="158">i</ch>
<ch l="989" t="312" r="1046" b="394" conf="158">e</ch>
<ch l="1051" t="312" r="1147" b="394" conf="158">m</ch>
<ch l="1157" t="283" r="1219" b="394" conf="158">b</ch>
<ch l="1224" t="312" r="1267" b="394" conf="198">r</ch>
<ch l="1267" t="283" r="1296" b="394" conf="198">i</ch>
<ch l="1306" t="312" r="1363" b="394" conf="158">e</ch>
</wd>

And here is my code:

XDocument document = XDocument.Load(fileName);
var coordinates = from r in document.Descendants("wd").ToList().Where
                  (r => (string)r.Attribute("l") != "")
                  select new
                  {
                      left = r.Attribute("l").Value,
                  };

foreach (var item in coordinates)
{
    Console.WriteLine(item.left);
}
Console.ReadLine();

This works with a simple XML file like the one above, but it doesn't work with a longer XML file like this one:

http://pastebin.com/LmDHRzC5

The longer file also has wd elements, and they also have an l attribute.

Thank you. I put the long XML on Pastebin because it's too long to post here.

Your larger document declares a default namespace on its root element:

<document xmlns="http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

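Because of that, Descendants("wd") only matches wd elements that are in no namespace, so it returns nothing. Matching on the local name works: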

document.Descendants().Where(e => e.Name.LocalName == "wd")

Or you can use one of the other options from the question "Search XDocument using LINQ without knowing the namespace".
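If the namespace is known, another way to write the query is to qualify the element name with an XNamespace. The following is only a minimal sketch: the class name NamespaceQueryDemo, the helper LeftEdges, and the inline XML are made up for illustration, and the inline XML is a trimmed stand-in for the real file.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class NamespaceQueryDemo
{
    // Hypothetical helper: returns the non-empty "l" attribute values
    // of every wd element that lives in the given namespace.
    public static string[] LeftEdges(XDocument document, XNamespace ns)
    {
        return document.Descendants(ns + "wd")
                       .Select(wd => (string)wd.Attribute("l"))
                       .Where(l => !string.IsNullOrEmpty(l))
                       .ToArray();
    }

    static void Main()
    {
        // Trimmed stand-in for the long file (assumption: same root namespace).
        const string xml =
            "<document xmlns=\"http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd\">" +
            "<wd l=\"821\" t=\"283\" r=\"1363\" b=\"394\">" +
            "<ch l=\"821\" t=\"312\" r=\"878\" b=\"394\" conf=\"158\">n</ch>" +
            "</wd></document>";
        XDocument document = XDocument.Parse(xml);

        // Namespace URI copied from the xmlns attribute of the root element.
        XNamespace ns = "http://www.scansoft.com/omnipage/xml/ssdoc-schema3.xsd";

        // Unqualified name: prints 0, because wd is in the default namespace.
        Console.WriteLine(document.Descendants("wd").Count());

        // Qualified name: prints 821.
        foreach (string left in LeftEdges(document, ns))
            Console.WriteLine(left);
    }
}
```

Instead of hard-coding the URI, document.Root.GetDefaultNamespace() should return the same XNamespace when the root element declares it, as it does in the snippet above.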

I'm not going to write all of the code, but this should get you started. I used LINQ to XML.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            XDocument doc;
            using (StreamReader reader = new StreamReader(FILENAME))
            {
                // Skip the first line, which holds the XML declaration (it declares UTF-16)
                reader.ReadLine();
                doc = XDocument.Load(reader);
            }

            XElement body = doc.Descendants().Where(x => x.Name.LocalName == "body").FirstOrDefault();
            XNamespace ns = body.GetDefaultNamespace();

            var results = new {
                sections = body.Elements(ns + "section").Select(x => new {
                    l = (int)x.Attribute("l"),
                    r = (int)x.Attribute("r"),
                    b = (int)x.Attribute("b"),
                    runs = x.Descendants(ns + "run").Select(y => new {
                        wds = y.Elements(ns + "wd").Select(z => new {
                            chs = z.Elements(ns + "ch").Select(a => new {
                                l = (int?)a.Attribute("l"),
                                t = (int?)a.Attribute("t"),
                                r = (int?)a.Attribute("r"),
                                b = (int?)a.Attribute("b"),
                                conf = (int?)a.Attribute("conf"),
                                value = (string)a
                            }).ToList()
                        }).ToList()
                    }).ToList()
                }).ToList(),
                dds = body.Elements(ns + "dd").Select(x => new {
                    l = (int)x.Attribute("l"),
                    r = (int)x.Attribute("r"),
                    b = (int)x.Attribute("b"),
                    paras = x.Elements(ns + "para").Select(y => new {
                        lns = y.Elements(ns + "ln").Select(z => new {
                            wds = z.Elements(ns + "wd").Select(a => new {
                                chs = a.Elements(ns + "ch").Select(b => new {
                                    l = (int?)b.Attribute("l"),
                                    t = (int?)b.Attribute("t"),
                                    r = (int?)b.Attribute("r"),
                                    b = (int?)b.Attribute("b"),
                                    conf = (int?)b.Attribute("conf"),
                                    value = (string)b
                                }).ToList()
                            }).ToList()
                        }).ToList()
                    }).ToList()
                }).ToList(),

            };
        }
    }
}
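Once the ch elements are reachable, a converter will probably need the text of each word and the size of its box. The sketch below shows one possible way to get both from a single wd element; the class WordTextDemo and its helpers are hypothetical, and the sample element is the one from the question with most ch attributes trimmed for brevity.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class WordTextDemo
{
    // Hypothetical helper: joins the text of all ch children of a wd element.
    // It reuses the namespace of wd itself, so it works with or without xmlns.
    public static string WordText(XElement wd)
    {
        XNamespace ns = wd.Name.Namespace;
        return string.Concat(wd.Elements(ns + "ch").Select(ch => (string)ch));
    }

    // Hypothetical helpers: box size computed from the l/r and t/b attributes.
    public static int Width(XElement wd)
    {
        return (int)wd.Attribute("r") - (int)wd.Attribute("l");
    }

    public static int Height(XElement wd)
    {
        return (int)wd.Attribute("b") - (int)wd.Attribute("t");
    }

    static void Main()
    {
        XElement wd = XElement.Parse(
            "<wd l=\"821\" t=\"283\" r=\"1363\" b=\"394\">" +
            "<ch l=\"821\">n</ch><ch l=\"888\">o</ch><ch l=\"955\">i</ch>" +
            "<ch l=\"989\">e</ch><ch l=\"1051\">m</ch><ch l=\"1157\">b</ch>" +
            "<ch l=\"1224\">r</ch><ch l=\"1267\">i</ch><ch l=\"1306\">e</ch>" +
            "</wd>");

        Console.WriteLine(WordText(wd)); // noiembrie
        Console.WriteLine(Width(wd));    // 542
        Console.WriteLine(Height(wd));   // 111
    }
}
```

The casts on XAttribute throw if an attribute is missing, so for real input the nullable (int?) casts used in the answer above may be the safer choice.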
