Parsing an HTML chunk with nbsp in MSXML

Question

I'm trying to load a chunk of HTML into MSXML's DOMDocument. The chunk is valid XML with one exception: it contains &nbsp; entities. MSXML chokes on them and reports "Reference to undefined entity 'nbsp'.".

Can I make MSXML recognize it as valid somehow?

Answer 1

Simple solution: Just run a text replacement of "&nbsp;" with " " before parsing the document. This should work because the text cannot contain a verbatim "&nbsp;" that ought to be left alone: outside a CDATA section, a literal ampersand in XML has to be written as &amp;, so every "&nbsp;" is an entity reference.
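
A minimal sketch of the replacement approach, written in Python because MSXML itself is Windows-only. It assumes Python's standard-library expat parser as a stand-in for MSXML (both reject the undefined nbsp entity); the helper name `load_chunk` is hypothetical.

```python
import xml.etree.ElementTree as ET


def load_chunk(chunk: str, replacement: str = " ") -> ET.Element:
    """Parse an XML-like HTML chunk after replacing &nbsp; references.

    `load_chunk` is a hypothetical helper, not an MSXML API.
    """
    # Plain text substitution, done before the parser ever sees the entity.
    cleaned = chunk.replace("&nbsp;", replacement)
    return ET.fromstring(cleaned)


chunk = "<p>Hello&nbsp;world</p>"

# Without the replacement, the parser rejects the undefined entity.
try:
    ET.fromstring(chunk)
except ET.ParseError as err:
    print("parse failed:", err)

root = load_chunk(chunk)
print(repr(root.text))
```

With MSXML the same idea would apply to the string handed to the document's loadXML call.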

More standard solution: Declare an nbsp entity in the XML by inserting

<!DOCTYPE foobar [
   <!ENTITY nbsp " " >
]>

before the XML root node.
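
The DOCTYPE approach can be sketched the same way. This again uses Python's expat-based parser as a stand-in for MSXML, and the helper name `with_nbsp_doctype` is hypothetical. The sketch assumes the chunk has no XML declaration; if it had one, the DOCTYPE would have to go after it.

```python
import xml.etree.ElementTree as ET


def with_nbsp_doctype(chunk: str, root_name: str) -> str:
    """Prepend an internal DTD subset that declares the nbsp entity.

    Hypothetical helper; assumes `chunk` starts at its root element.
    """
    doctype = (
        f"<!DOCTYPE {root_name} [\n"
        '   <!ENTITY nbsp " " >\n'
        "]>\n"
    )
    return doctype + chunk


doc = with_nbsp_doctype("<p>Hello&nbsp;world</p>", "p")
root = ET.fromstring(doc)  # the parser now expands &nbsp; itself
print(repr(root.text))
```

Unlike the text replacement, this leaves the chunk itself untouched and lets the parser resolve the entity.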

If you actually want a non-breaking space (character 0xA0) instead of a normal space, use &#x00A0; as the replacement text or as the entity value.
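
As a quick check of the non-breaking variant, the sketch below declares the entity with &#x00A0; as its value and confirms that the parsed text contains U+00A0. It assumes Python's expat-based parser rather than MSXML, and the document string is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative document: nbsp is declared as the character reference &#x00A0;.
doc = (
    "<!DOCTYPE p [\n"
    '   <!ENTITY nbsp "&#x00A0;" >\n'
    "]>\n"
    "<p>Hello&nbsp;world</p>"
)

root = ET.fromstring(doc)

# The entity expands to a real non-breaking space, not an ASCII space.
print(root.text == "Hello\u00a0world")
print(" " in root.text)
```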

Answer 1 (accepted, 2013-02-28 20:27:52)
