
Include XML CDATA in an element

UPDATE: Added more detail per request.

I am trying to create an XML configuration file for my application. The file contains a list of criteria to search and replace in an HTML document. The problem is, I need to search for character strings like &nbsp;. I do not want my code to read the decoded item, but the text itself.

Admitting to being very new to XML, I did make some attempts at meeting the requirements. I read a load of links here on Stack Overflow regarding CDATA and ATTRIBUTES and so on, but the examples here (and elsewhere) seem to focus on creating one single line in an XML file, not multiple.

Here is one of many attempts I have made to no avail:

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE item [
  <!ELEMENT item (id, replacewith)>
  <!ELEMENT id (#CDATA)>
  <!ELEMENT replacewith (#CDATA)>
  ]>
]>
<item id=" " replacewith="&nbsp;">Non breaking space</item>
<item id="&#8209;" replacewith="-">Non breaking hyphen</item>

This document gives me a number of errors, including:

  • In the DOCTYPE, I get errors on lines like <!ELEMENT id (#CDATA)>. In the CDATA area, Visual Studio informs me it is expecting a ',' or '|'.
  • ]> gives me the error "invalid token at the root of the document".
  • And of course, after the second <item entry, I get an error stating "XML document cannot contain multiple root level elements".

How can I write an XML file that includes multiple items and allows me to store and retrieve the text within the element, rather than the interpreted characters?

If it helps any, I am using .NET, C#, and Visual Studio.

EDIT: The purpose of this XML file is to provide my code with a list of things to search and replace in an HTML file. The XML file simply contains a list of what to search for and what to replace with.

Here is the file I have in place right now:

<?xml version="1.0" encoding="utf-8" ?>
<Items>
  <item id="&#8209;" replacewith="-">Non breaking hyphen</item>
  <item id=" " replacewith="&nbsp;">Non breaking hyphen</item>
</Items>

Using the first as an example, I want to read the text &#8209; but instead, when I read this, I get the non-breaking hyphen character itself, because that is what the code represents.

Any help or pointers you can give would be appreciated.

To elaborate on my comment: XML acts like HTML with respect to reserved characters. An ampersand introduces an entity name or a character code, and any type of parser (browser, XML reader, etc.) translates that reference into the character it stands for when the document is read in.
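To see that decoding in action, here is a minimal sketch using `XmlDocument.LoadXml`. The `EntityDemo` class and its `ReadId` helper are hypothetical names used only for illustration; they are not part of the original code.

```csharp
using System;
using System.Xml;

public static class EntityDemo
{
    // Hypothetical helper: parses a one-element document and
    // returns the value of its "id" attribute.
    public static string ReadId(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.GetAttribute("id");
    }

    public static void Main()
    {
        // A raw character reference is decoded into a single character (U+2011).
        string decoded = ReadId("<item id=\"&#8209;\" />");
        Console.WriteLine(decoded.Length);        // 1
        Console.WriteLine(decoded == "\u2011");   // True

        // Escaping the ampersand keeps the reference as literal text.
        string literal = ReadId("<item id=\"&amp;#8209;\" />");
        Console.WriteLine(literal);               // &#8209;

        // &nbsp; is an HTML entity, not one of XML's predefined entities,
        // so without a DTD that declares it the parser rejects the document.
        try
        {
            ReadId("<item id=\"&nbsp;\" />");
        }
        catch (XmlException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

The last case is worth noting: a raw `&nbsp;` in an attribute does not merely decode to the wrong thing, it stops the file from loading at all unless the entity is declared.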

The easiest way to escape the values so that they are read back in as the literal text you want is to write them as if you were encoding them for the web. For example, to create your XML document, I did this:

     // Requires: using System.Text; using System.Xml;
     XmlDocument xmlDoc = new XmlDocument();
     XmlElement xmlItem;
     XmlAttribute xmlAttr;
     XmlText xmlText;

     // Declaration
     XmlDeclaration xmlDec = xmlDoc.CreateXmlDeclaration("1.0", "UTF-8", null);
     XmlElement xmlRoot = xmlDoc.DocumentElement;
     xmlDoc.InsertBefore(xmlDec, xmlRoot);

     // Items
     XmlElement xmlItems = xmlDoc.CreateElement(string.Empty, "Items", string.Empty);
     xmlDoc.AppendChild(xmlItems);

     // Item #1
     xmlItem = xmlDoc.CreateElement(string.Empty, "item", string.Empty);
     xmlAttr = xmlDoc.CreateAttribute(string.Empty, "id", string.Empty);
     xmlAttr.Value = "&#8209;";
     xmlItem.Attributes.Append(xmlAttr);
     xmlAttr = xmlDoc.CreateAttribute(string.Empty, "replacewith", string.Empty);
     xmlAttr.Value = "-";
     xmlItem.Attributes.Append(xmlAttr);
     xmlText = xmlDoc.CreateTextNode("Non breaking hyphen");
     xmlItem.AppendChild(xmlText);

     xmlItems.AppendChild(xmlItem);

     // Item #2
     xmlItem = xmlDoc.CreateElement(string.Empty, "item", string.Empty);
     xmlAttr = xmlDoc.CreateAttribute(string.Empty, "id", string.Empty);
     xmlAttr.Value = " ";
     xmlItem.Attributes.Append(xmlAttr);
     xmlAttr = xmlDoc.CreateAttribute(string.Empty, "replacewith", string.Empty);
     xmlAttr.Value = "&nbsp;";
     xmlItem.Attributes.Append(xmlAttr);
     xmlText = xmlDoc.CreateTextNode("Non breaking hyphen");
     xmlItem.AppendChild(xmlText);

     xmlItems.AppendChild(xmlItem);

     // For formatting
     StringBuilder xmlBuilder = new StringBuilder();
     XmlWriterSettings xmlSettings = new XmlWriterSettings
     {
        Indent = true,
        IndentChars = "  ",
        NewLineChars = "\r\n",
        NewLineHandling = NewLineHandling.Replace
     };
     using (XmlWriter writer = XmlWriter.Create(xmlBuilder, xmlSettings))
     {
        xmlDoc.Save(writer);
     }

     xmlOutput.Text = xmlBuilder.ToString();

Notice that I set the attribute values to exactly the text you expect to read back. Now, look at how it gets encoded:

<?xml version="1.0" encoding="utf-16"?>
<Items>
  <item id="&amp;#8209;" replacewith="-">Non breaking hyphen</item>
  <item id=" " replacewith="&amp;nbsp;">Non breaking hyphen</item>
</Items>

Apart from the encoding declaration (utf-16 here because the output was written to a StringBuilder), the only difference between yours and this one is that the ampersand was encoded as &amp; and the rest remained as a string literal. This is normal behavior for XML. When you read it back in, the values will come back as the literals &#8209; and &nbsp;.
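Reading the file back could look roughly like the sketch below. It assumes the `<Items>`/`<item>` layout shown above; the `ReplacementReader` class and its `Load` method are hypothetical names, and the XML is inlined as a string only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class ReplacementReader
{
    // Hypothetical helper: returns (search, replace) pairs taken from the
    // id and replacewith attributes of each /Items/item element.
    public static List<KeyValuePair<string, string>> Load(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var pairs = new List<KeyValuePair<string, string>>();
        foreach (XmlElement item in doc.SelectNodes("/Items/item"))
        {
            pairs.Add(new KeyValuePair<string, string>(
                item.GetAttribute("id"),
                item.GetAttribute("replacewith")));
        }
        return pairs;
    }

    public static void Main()
    {
        // Same content as the encoded document above, inlined for the demo.
        string xml =
            "<Items>" +
            "<item id=\"&amp;#8209;\" replacewith=\"-\">Non breaking hyphen</item>" +
            "<item id=\" \" replacewith=\"&amp;nbsp;\">Non breaking hyphen</item>" +
            "</Items>";

        foreach (var pair in Load(xml))
        {
            Console.WriteLine("search [" + pair.Key + "] replace with [" + pair.Value + "]");
        }
    }
}
```

Because the parser turns `&amp;` back into `&`, the strings handed to the search-and-replace code are the literal `&#8209;` and `&nbsp;` rather than the characters those references stand for.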
