
Preserve white space and new lines in XDocument

I have an XML file that looks something like the sample below.

Question: how do I preserve all the white space and line breaks when I load the document through XDocument? LoadOptions.PreserveWhitespace does not work.

Thanks.

 <!--
********************************************************
 header
********************************************************
    -->
   <!--sample -->
   <realmCode
      code="US"/>
   
   <!-- sample -->
   <typeId
      root="2.16.840.1.113883.1.3"
      extension="samo"/>
   
   <!-- sample -->
   <!-- sample -->
   <templateId
      root="2.16.840.1.113883.10.20.22.1.1"/>
   <!-- *** formatting. *** -->
   <!-- formatting -->
   <templateId
      root="2.16.840.1.113883.10.20.22.1.2"/>
   
   <!-- formatting -->
   <id`
      extension="samo"
      root="1.1.1.1.1.1.1.1.1"/>
   
   <!--formatting -->"

Your XML example has a few problems that should be fixed before parsing it:

  1. Missing root element. It can be added manually before parsing your example.
  2. Invalid grave accent character ( ` , at "<id` extension..."). It should be removed.
  3. Trailing double quote that serves no purpose.
  4. Line breaks (they do not affect parsing, but should be fixed too). Whitespace is not a problem at all.

So, to fix them all, first read the XML file as a single string with System.IO.File.ReadAllText. Then use the Regex class from the System.Text.RegularExpressions namespace and its Replace() method with the pattern @"[`\r\n]" to remove the line breaks and the invalid grave accent character. The double quote at the end of the document can simply be trimmed with the Trim method.

Because your XML example has no root element, which would cause a System.Xml.XmlException about the root element when you try to Parse it, we add one manually by concatenating a root tag: "<root>" + fixedXmlString + "</root>".

The whole piece of code looks like this:

using System.IO;
using System.Text.RegularExpressions;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Read the XML file as a single string, then:
        //   - remove the invalid grave accent (`)
        //   - remove line breaks
        //   - trim the trailing double quote
        var xmlString = Regex.Replace(File.ReadAllText("example.xml"), @"[`\r\n]", "").Trim('"');

        // Add a root element, since the example does not have one.
        xmlString = "<root>" + xmlString + "</root>";

        // Now it is parsable.
        XDocument xDoc = XDocument.Parse(xmlString);

        // Save the corrected document.
        xDoc.Save("example_fixed.xml");
    }
}

The output (after .Save()) looks like this:

[Image: screenshot of the saved output, not included]
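Since the question is about keeping line breaks, a variant of the code above may be worth trying. This is only a minimal sketch under stated assumptions: the helper name RoundTrip and the inline sample string are made up for illustration, and the made-up <root> wrapper is kept from the answer. It removes only the grave accent and the trailing quote, leaves the line breaks alone, parses with LoadOptions.PreserveWhitespace, and serializes with SaveOptions.DisableFormatting so the writer does not re-indent.

```csharp
using System;
using System.Xml.Linq;

static class WhitespaceSketch
{
    // Hypothetical helper (not from the original answer): repairs the sample
    // without removing its line breaks, then round-trips it through XDocument.
    public static string RoundTrip(string rawXml)
    {
        // Remove every grave accent, then trailing whitespace and the trailing quote.
        // Assumption: the document has no legitimate grave accents to keep.
        string fixedXml = rawXml.Replace("`", "").TrimEnd().TrimEnd('"');

        // Wrap the fragment in a made-up <root> element so it parses.
        fixedXml = "<root>" + fixedXml + "</root>";

        // PreserveWhitespace keeps whitespace-only text between nodes.
        XDocument doc = XDocument.Parse(fixedXml, LoadOptions.PreserveWhitespace);

        // DisableFormatting stops the serializer from re-indenting the output.
        return doc.ToString(SaveOptions.DisableFormatting);
    }

    static void Main()
    {
        // Shortened, made-up stand-in for the file contents.
        string sample =
            "\n   <!-- sample -->\n   <typeId\n      root=\"1.2\"\n      extension=\"samo\"/>\n" +
            "   \n   <id`\n      root=\"1.1\"/>\"";
        Console.WriteLine(RoundTrip(sample));
    }
}
```

When saving to a file, the same option can be passed as xDoc.Save("example_fixed.xml", SaveOptions.DisableFormatting). Note that this keeps line breaks and indentation between elements and comments only. Line breaks inside a tag, between attributes, are not part of what XDocument stores, so attributes are written back on a single line either way.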
