
Conversion of special characters when adding them to XML inner text in C#

While writing the inner text I need to use the hexadecimal code for special characters, but I am not able to add it. I tried some encoding changes, but they did not work. I need output like:

&#x2013;CO&#x2013;OR instead of "–CO–OR"

"&#x2B;" instead of "+"

The code I am trying to convert is provided below.

else
{
  //convertedStr = System.Net.WebUtility.HtmlDecode(runText);
  Encoding iso = Encoding.Default; 
  Encoding utf8 = Encoding.Unicode;
  byte[] utfBytes = utf8.GetBytes(runText);
  byte[] isoBytes = Encoding.Convert(iso, utf8, utfBytes);
  string msg = iso.GetString(isoBytes);    
  eqnPartElm = clsGlobal.XMLDoc.CreateElement("inf");
  eqnPartElm.InnerText = msg;
  eqnElm.AppendChild(eqnPartElm);   
}

Escaping of Unicode characters is not modeled or controlled by XmlDocument. Instead, XmlWriter escapes characters in character data and attribute values that are not supported by the current encoding, as specified by XmlWriterSettings.Encoding, at the time the document is written to a stream. If you want all "special characters" such as the En Dash to be escaped, choose a very restrictive encoding such as Encoding.ASCII.
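To illustrate the first point, here is a minimal sketch (the class name InnerTextDemo and the sample strings are made up for this illustration). It shows that the DOM stores characters rather than escapes: a character reference assigned through InnerText is treated as literal text, so its ampersand is escaped on output, while the same string assigned through InnerXml is parsed into the actual character.

```csharp
using System;
using System.Xml;

public static class InnerTextDemo
{
    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<Root><inf/></Root>");
        XmlNode inf = doc.DocumentElement.FirstChild;

        // InnerText stores the string as literal text, so the '&' itself
        // is escaped when the node is serialized.
        inf.InnerText = "&#x2013;CO";
        Console.WriteLine(inf.OuterXml); // <inf>&amp;#x2013;CO</inf>

        // InnerXml parses the markup, so the reference becomes the
        // En Dash character (U+2013) inside the DOM.
        inf.InnerXml = "&#x2013;CO";
        Console.WriteLine(inf.InnerText == "\u2013CO"); // True
    }
}
```

Either way the document holds only the character itself, which is why the escaping has to be requested at write time by picking a restrictive encoding.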

To do this easily, create the following extension methods:

using System.IO;
using System.Text;
using System.Xml;

public static class XmlSerializationHelper
{
    public static string GetOuterXml(this XmlNode node, bool indent = false, Encoding encoding = null, bool omitXmlDeclaration = false)
    {
        if (node == null)
            return null;
        using var stream = new MemoryStream();
        node.Save(stream, indent : indent, encoding : encoding, omitXmlDeclaration : omitXmlDeclaration, closeOutput : false);
        stream.Position = 0;
        using var reader = new StreamReader(stream);
        return reader.ReadToEnd();
    }

    public static void Save(this XmlNode node, Stream stream, bool indent = false, Encoding encoding = null, bool omitXmlDeclaration = false, bool closeOutput = true) =>
        node.Save(stream, new XmlWriterSettings
                  {
                      Indent = indent,
                      // Fall back to UTF-8 (the XmlWriterSettings default) when no encoding is given.
                      Encoding = encoding ?? Encoding.UTF8,
                      OmitXmlDeclaration = omitXmlDeclaration,
                      CloseOutput = closeOutput,
                  });

    public static void Save(this XmlNode node, Stream stream, XmlWriterSettings settings)
    {
        using (var xmlWriter = XmlWriter.Create(stream, settings))
        {
            node.WriteTo(xmlWriter);
        }
    }
}

Now you can do the following to serialize an XmlDocument to a string with non-ASCII characters escaped:

// Construct your XmlDocument (not shown in the question)
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml("<Root></Root>");
var eqnPartElm = xmlDoc.CreateElement("inf");
xmlDoc.DocumentElement.AppendChild(eqnPartElm);

// Add some non-ASCII text (here – is an En Dash character).
eqnPartElm.InnerText = "–CO–OR";

// Output to XML and escape all non-ASCII characters.
var xml = xmlDoc.GetOuterXml(indent : true, encoding : Encoding.ASCII, omitXmlDeclaration : true);

To serialize to a Stream, do:

// FileMode.Create truncates an existing file, so no stale bytes remain after the new XML.
using (var stream = new FileStream(fileName, FileMode.Create))
{
    xmlDoc.Save(stream, indent : true, encoding : Encoding.ASCII, omitXmlDeclaration : true);
}

The following XML will be created:

<Root>
  <inf>&#x2013;CO&#x2013;OR</inf>
</Root>
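For comparison, the sketch below serializes the same document twice, once as ASCII and once as UTF-8, to show that the escaping depends only on the encoding chosen at write time. The class name EncodingComparisonDemo and the Serialize helper are hypothetical names used for this illustration; they condense the extension methods above and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class EncodingComparisonDemo
{
    // Hypothetical helper: write the node through an XmlWriter that
    // targets a stream with the given encoding, then read the text back.
    public static string Serialize(XmlNode node, Encoding encoding)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = encoding,
            OmitXmlDeclaration = true,
        };
        using var stream = new MemoryStream();
        using (var writer = XmlWriter.Create(stream, settings))
        {
            node.WriteTo(writer);
        }
        stream.Position = 0;
        using var reader = new StreamReader(stream);
        return reader.ReadToEnd();
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<Root><inf/></Root>");
        doc.DocumentElement.FirstChild.InnerText = "\u2013CO\u2013OR";

        // ASCII cannot represent U+2013, so it is written as a character reference.
        Console.WriteLine(Serialize(doc, Encoding.ASCII));

        // UTF-8 can represent U+2013, so the character is written as-is.
        string utf8 = Serialize(doc, new UTF8Encoding(false));
        Console.WriteLine(utf8.Contains("\u2013")); // True
    }
}
```

The first call prints `<Root><inf>&#x2013;CO&#x2013;OR</inf></Root>`, matching the output above apart from indentation.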

Notes:

  • You must use the new XmlWriter, not the old XmlTextWriter, as the latter does not support replacing unsupported characters with escaped fallbacks.

  • Some parts of an XML document, including element and attribute names and comment text, do not support inclusion of character entities. If you attempt to write an unsupported character in such a situation, XmlWriter will throw an exception.
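The second note can be checked with a small sketch like the one below. The names UnescapableNameDemo and TryWriteAscii are hypothetical, and the element name is invented for the illustration. The answer does not say which exception type is thrown, so the sketch catches Exception and reports the type name instead of assuming one.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class UnescapableNameDemo
{
    // Hypothetical helper: returns the type name of the exception thrown
    // while writing the node as ASCII, or null if writing succeeded.
    public static string TryWriteAscii(XmlNode node)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = Encoding.ASCII,
            OmitXmlDeclaration = true,
        };
        try
        {
            using var stream = new MemoryStream();
            using (var writer = XmlWriter.Create(stream, settings))
            {
                node.WriteTo(writer);
            }
            return null;
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        // A valid XML name that contains non-ASCII letters (e with acute accent).
        var root = doc.CreateElement("r\u00E9sum\u00E9");
        doc.AppendChild(root);
        root.InnerText = "\u2013CO";

        Console.WriteLine(TryWriteAscii(doc) ?? "no exception");
    }
}
```

If such names are unavoidable, a less restrictive encoding that can represent them would be needed, at the cost of no longer escaping those characters in text either.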

Demo fiddle here.

