
How do I troubleshoot “simplexml_load_file() parser error: Entity 'nbsp' not defined”?

I use PHP to generate XML files. I use the code below to escape special characters and avoid errors:

$str = str_ireplace(array('<','>','&','\'','"'),array('&lt;','&gt;','&amp;','&apos;','&quot;'),$str);

But it still causes an error:

simplexml_load_file() [function.simplexml-load-file] *[file name]* parser error : Entity 'nbsp' not defined in *[file name] [line]*

The text that triggers the error:

Dallas&nbsp;&nbsp;Dallas () is the third-largest city in Texas and the ninth-largest in the United States.

In IE8, the fault seems to be at (). So which characters do I need to watch out for?

&nbsp; is an HTML entity, but it doesn't exist in XML.

Either get rid of it (you're not saying where it comes from, so it's hard to give more specific advice), or wrap your HTML data in CDATA blocks so the parser treats it as literal text and does not try to resolve the entities.
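The CDATA option can be sketched as follows. This is only an illustration: the element names <city> and <description> and the sample text are made up, and the string is parsed with simplexml_load_string() rather than a file so the example is self-contained.

```php
<?php
// Hypothetical input: an HTML fragment that contains the &nbsp; entity.
$html = 'Dallas&nbsp;&nbsp;Dallas is the third-largest city in Texas.';

// A CDATA section cannot contain the sequence "]]>", so split it if it occurs.
$safe = str_replace(']]>', ']]]]><![CDATA[>', $html);

// <city> and <description> are made-up element names for this sketch.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<city><description><![CDATA[' . $safe . ']]></description></city>';

// Collect libxml errors internally so a failure shows up as `false`.
libxml_use_internal_errors(true);
$doc = simplexml_load_string($xml);

// Entities are not interpreted inside CDATA, so the text comes back verbatim.
$text = (string) $doc->description;
echo $text, PHP_EOL;
```

Keep in mind that whoever reads the XML gets the raw HTML back, including the literal "&nbsp;", and has to decode it themselves.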

HTML-specific entities (in this case &nbsp;) are not valid XML entities, and that is what SimpleXML complains about: it reads the file as XML (not HTML) and finds entities that are not defined. You need to convert the HTML entities back to their character representation first (you can use html_entity_decode() to do that):

$str = "some string containing html";
// this line will convert back html entities to regular characters
$str = html_entity_decode($str, ...);
// now convert special characters to their xml entities ('&' must come first,
// otherwise the '&' of the entities just inserted would be escaped again)
$str = str_ireplace(array('&','<','>','\'','"'),array('&amp;','&lt;','&gt;','&apos;','&quot;'),$str);

save_to_xml($str);
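A self-contained version of the same two steps might look like the sketch below. It makes a few assumptions that are not in the answer: the data is UTF-8, the flags passed to html_entity_decode() are one reasonable choice for the elided arguments, htmlspecialchars() with ENT_XML1 is swapped in for the manual replacement, and the <description> element and sample text are made up.

```php
<?php
// Hypothetical input containing an HTML-only entity and XML special characters.
$str = 'Dallas&nbsp;&nbsp;Dallas <b>Texas</b> & "friends"';

// 1. Turn HTML entities back into real characters (&nbsp; becomes U+00A0).
$decoded = html_entity_decode($str, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// 2. Escape only the five characters XML reserves. With ENT_XML1,
//    htmlspecialchars() emits &apos; for single quotes.
$escaped = htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');

// <description> is a made-up element name for this sketch.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<description>' . $escaped . '</description>';

// Collect libxml errors internally so a failure shows up as `false`.
libxml_use_internal_errors(true);
$doc = simplexml_load_string($xml);

// The parsed text equals the decoded string, non-breaking spaces included.
var_dump($doc !== false && (string) $doc === $decoded);
```

Using htmlspecialchars() here avoids having to get the order of the replacements right by hand.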

Note that if you run htmlentities() on your string before saving it in the XML, that is the source of your problem: it converts characters to their HTML entities, which SimpleXML does not recognize as XML entities.

// this won't work: the html entities it produces are not valid xml entities
$str = htmlentities($str, ...);

save_to_xml($str);

If you have trouble understanding this, think of them as two different languages, like Spanish (HTML) and English (XML): a word that is valid in Spanish (&nbsp;) is not necessarily valid in English, no matter how similar the two languages are.

&nbsp; is a non-breaking space. You have to replace it. http://en.wikipedia.org/wiki/Non-breaking_space
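One possible replacement is the numeric character reference &#160;, which denotes the same character (U+00A0) and needs no declaration in XML. A minimal sketch, with a made-up <description> element and sample text:

```php
<?php
// Hypothetical input that still contains the HTML-only entity.
$str = 'Dallas&nbsp;&nbsp;Dallas';

// &#160; is the numeric character reference for U+00A0 (no-break space).
$fixed = str_replace('&nbsp;', '&#160;', $str);

// <description> is a made-up element name for this sketch.
$xml = '<description>' . $fixed . '</description>';

// Collect libxml errors internally so a failure shows up as `false`.
libxml_use_internal_errors(true);
$doc = simplexml_load_string($xml);

var_dump($doc !== false);
```

This only handles &nbsp;. Other HTML entities (&copy;, &eacute; and so on) would trigger the same error, so decoding everything with html_entity_decode() is the more general fix.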
