
Reading invalid XML with XDocument or XmlDocument in C#

I have a situation where I used XSLT to transform an XML file.

Now I need to modify the resulting file, which is not valid XML, so XML parsers cannot read it.

It doesn't start with an XML declaration and it has no single root element.

I cannot change the structure of the file, because it follows another standard that I have to use, but I need to add a node inside the valid XML and also get information from a specific node.

This is what I have already tried:

XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;
doc.Load(InputFile);
XmlElement root = doc.DocumentElement;

With this I only got content from inside the invalid XML, not from inside the valid XML.

What I really need is a list of the "validXmlWithDeclaration" nodes.

The structure is something like this:

<invalidXMLWithoutDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</invalidXMLWithoutDeclaration>
<validXmlWithDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</validXmlWithDeclaration>
<invalidXMLWithoutDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</invalidXMLWithoutDeclaration>
<validXmlWithDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</validXmlWithDeclaration>
<invalidXMLWithoutDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</invalidXMLWithoutDeclaration>
<validXmlWithDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</validXmlWithDeclaration>

Here is an example that parses the snippet you have shown by setting the InnerXml property of an XmlDocumentFragment and then selects some of the elements in it:

        XmlDocument doc = new XmlDocument();
        XmlDocumentFragment fragment = doc.CreateDocumentFragment();
        fragment.InnerXml = @"<invalidXMLWithoutDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</invalidXMLWithoutDeclaration>
<validXmlWithDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</validXmlWithDeclaration>
<invalidXMLWithoutDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</invalidXMLWithoutDeclaration>
<validXmlWithDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</validXmlWithDeclaration>
<invalidXMLWithoutDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</invalidXMLWithoutDeclaration>
<validXmlWithDeclaration>
 <foo>
  <bar>
  </bar>
 </foo>
</validXmlWithDeclaration>";
        foreach (XmlElement el in fragment.SelectNodes("validXmlWithDeclaration"))
        {
            Console.WriteLine(el.OuterXml);
        }
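The question also asks about adding a node inside the valid elements. A minimal sketch of that, building on the fragment approach above, might look like the following. The class name FragmentEditor, the method name AddNoteToValidElements and the element name "note" are made up for illustration, and the sketch assumes the text contains no embedded <?xml ...?> declarations, since those would make the fragment parse fail.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class FragmentEditor
{
    // Parses multi-root XML text into a document fragment, appends an empty
    // <note> element (a hypothetical name) to every top-level
    // validXmlWithDeclaration element, and returns the text without
    // introducing a wrapper root.
    public static string AddNoteToValidElements(string multiRootXml)
    {
        XmlDocument doc = new XmlDocument();
        // Ask the document to keep whitespace-only text nodes between elements.
        doc.PreserveWhitespace = true;

        XmlDocumentFragment fragment = doc.CreateDocumentFragment();
        fragment.InnerXml = multiRootXml;

        // Snapshot the matches first so the tree is not modified
        // while the XPath result is still being enumerated.
        List<XmlElement> targets = fragment
            .SelectNodes("validXmlWithDeclaration")
            .Cast<XmlElement>()
            .ToList();

        foreach (XmlElement el in targets)
        {
            XmlElement note = doc.CreateElement("note");
            el.AppendChild(note);
        }

        // InnerXml of the fragment serializes all top-level nodes in order.
        return fragment.InnerXml;
    }
}

// Usage (file paths are placeholders):
//   string text = System.IO.File.ReadAllText(inputPath);
//   System.IO.File.WriteAllText(outputPath, FragmentEditor.AddNoteToValidElements(text));
```

Returning fragment.InnerXml rather than saving the XmlDocument keeps the multi-root layout, which matters here because the file structure cannot be changed.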

Your file is simply not well-formed XML. See my solution below, which uses XmlReader and XElement (LINQ to XML).

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;


namespace ConsoleApplication62
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
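            // ConformanceLevel.Fragment lets the reader accept input with several top-level elements.
            XmlReaderSettings settings = new XmlReaderSettings();
            settings.ConformanceLevel = ConformanceLevel.Fragment;
            // The settings must be passed to Create, otherwise the default (Document) level is used.
            XmlReader reader = XmlReader.Create(FILENAME, settings);

            while (!reader.EOF)
            {
                if (reader.Name != "invalidXMLWithoutDeclaration")
                {
                    reader.ReadToFollowing("invalidXMLWithoutDeclaration");
                }
                if (!reader.EOF)
                {
                    // ReadFrom consumes the whole element and advances the reader past it.
                    XElement invalidXMLWithoutDeclaration = (XElement)XElement.ReadFrom(reader);
                }
            }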

        }

    }
}
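Since the question asks for a list of the "validXmlWithDeclaration" nodes, here is a hedged variation of the reader loop above that collects the matching top-level elements into a list. The names FragmentReader and ReadTopLevelElements are hypothetical, and the sketch reads from a TextReader so it can be tried on a string as well as on a file.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class FragmentReader
{
    // Returns every top-level element with the given name from input that
    // may contain several root elements. Other top-level elements are skipped.
    public static List<XElement> ReadTopLevelElements(TextReader input, string elementName)
    {
        var result = new List<XElement>();
        var settings = new XmlReaderSettings
        {
            // Fragment mode allows more than one top-level element.
            ConformanceLevel = ConformanceLevel.Fragment
        };

        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the
                    // reader on the node that follows its end tag.
                    result.Add((XElement)XNode.ReadFrom(reader));
                }
                else if (reader.NodeType == XmlNodeType.Element)
                {
                    // Skip an unwanted top-level element together with its children.
                    reader.Skip();
                }
                else
                {
                    // Whitespace, comments and similar nodes between elements.
                    reader.Read();
                }
            }
        }
        return result;
    }
}

// Usage (the path is a placeholder):
//   using (var file = new StreamReader(path))
//   {
//       List<XElement> valid = FragmentReader.ReadTopLevelElements(file, "validXmlWithDeclaration");
//   }
```

Because every top-level element is either consumed by ReadFrom or skipped as a whole, nested elements that happen to share the name are not picked up separately.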
