
Is there a way to read raw content from XmlReader?

I have a very large XML file, so I am using XmlReader in C#. The problem is that some of the content contains XML-like markers that should not be processed by XmlReader.

<Narf name="DOH">Mark's test of <newline> like stuff</Narf>

This is legacy data, so it cannot be refactored (of course).

I have tried ReadInnerXml, but I get the whole node. I have tried ReadElementContentAsString, but I get an exception saying 'newline' is not closed.

// Neither attempt copes with markup in the content
ms.mText = reader.ReadElementContentAsString();
XElement el = XNode.ReadFrom(reader) as XElement;
ms.mText = el.ToString();

What I want is for ms.mText to equal "Mark's test of <newline> like stuff", not an exception.

System.Xml.XmlException was unhandled
  HResult=-2146232000
  LineNumber=56
  LinePosition=63
  Message=The 'newline' start tag on line 56 position 42 does not match the end tag of 'Narf'. Line 56, position 63.
  Source=System.Xml

The question flagged as a duplicate did not solve the problem, because it requires changing the input to remove the problem before using the data. As stated above, this is legacy data.

I figured it out based on the responses here! Not elegant, but it works...

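   using System.IO;
   using System.Text;

   public class TextWedge : TextReader
   {
      // Look-ahead window; it must be longer than the longest tag being replaced
      private const int WindowSize = 50;
      private readonly StreamReader mSr;
      private readonly StringBuilder mBuffer = new StringBuilder();

      public TextWedge(string filename)
      {
         mSr = File.OpenText(filename);
         Fill();
      }

      // Top the window up from the file (stopping at end of file, so short
      // files never put a bogus (char)-1 into the buffer), then rewrite tags.
      private void Fill()
      {
         int ic;
         while (mBuffer.Length < WindowSize && (ic = mSr.Read()) != -1)
         {
            mBuffer.Append((char)ic);
         }
         // Run through the battery of non-xml tags
         mBuffer.Replace("<newline>", "[br]");
      }

      // Peek must return the next character without consuming it, or -1 at the end
      public override int Peek()
      {
         return mBuffer.Length > 0 ? mBuffer[0] : -1;
      }

      public override int Read()
      {
         if (mBuffer.Length == 0)
         {
            return -1;
         }
         char c = mBuffer[0];
         mBuffer.Remove(0, 1);
         // Refill after every read so replacements cannot shrink the window
         Fill();
         return c;
      }

      protected override void Dispose(bool disposing)
      {
         if (disposing)
         {
            mSr.Dispose();
         }
         base.Dispose(disposing);
      }
   }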
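
To try the wedge without a file on disk, here is a minimal sketch of how the same idea might be wired into XmlReader.Create. ReplacingReader and Demo are hypothetical names that do not appear in the post; this variant wraps any TextReader instead of opening a filename, and it assumes a 50-character window and the single <newline> tag, as in the class above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical variant of TextWedge (not from the original post): it wraps any
// TextReader, so the approach can be exercised on an in-memory string.
public sealed class ReplacingReader : TextReader
{
    private const int WindowSize = 50;   // assumed look-ahead size, as in the post
    private readonly TextReader mInner;
    private readonly StringBuilder mBuffer = new StringBuilder();

    public ReplacingReader(TextReader inner)
    {
        mInner = inner;
        Fill();
    }

    // Top the window up, then rewrite the non-XML tag wherever it is fully buffered.
    private void Fill()
    {
        int ic;
        while (mBuffer.Length < WindowSize && (ic = mInner.Read()) != -1)
        {
            mBuffer.Append((char)ic);
        }
        mBuffer.Replace("<newline>", "[br]");
    }

    public override int Peek()
    {
        return mBuffer.Length > 0 ? mBuffer[0] : -1;
    }

    // XmlReader pulls characters through TextReader.Read(char[], int, int),
    // whose default implementation calls this method repeatedly.
    public override int Read()
    {
        if (mBuffer.Length == 0) return -1;
        char c = mBuffer[0];
        mBuffer.Remove(0, 1);
        Fill();
        return c;
    }
}

public static class Demo
{
    // Reads the text content of the first <Narf> element through the wedge.
    public static string ReadNarf(string xml)
    {
        using (var wedge = new ReplacingReader(new StringReader(xml)))
        using (var reader = XmlReader.Create(wedge))
        {
            reader.ReadToFollowing("Narf");
            return reader.ReadElementContentAsString();
        }
    }

    public static void Main()
    {
        string xml = "<Narf name=\"DOH\">Mark's test of <newline> like stuff</Narf>";
        Console.WriteLine(ReadNarf(xml));   // prints: Mark's test of [br] like stuff
    }
}
```

If the literal text is needed rather than a [br] placeholder, replacing "<newline>" with "&lt;newline&gt;" should make ReadElementContentAsString return the original "<newline>" string, since XmlReader unescapes entities in text content.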
