
XML Error: Invalid character

I have the PHP code below, which parses XML fetched from a URL:

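// Create the parser; character data is echoed as it is encountered.
$parser = xml_parser_create();

function char($parser, $data)
{
    echo $data;
}

xml_set_character_data_handler($parser, "char");

$fp = fopen("http://example.com", "r") or die("Could not open XML input");

// Feed the document to the parser in 4 KB chunks.
while ($data = fread($fp, 4096)) {
    xml_parse($parser, $data, feof($fp)) or
        die(sprintf(
            "XML Error: %s at line %d",
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
}

fclose($fp);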

The XML returned by the fopen call above looks like this. It has no encoding declaration at the top. In the browser, the code above outputs "XML Error: Invalid character at line 1008".

<entries> <!-- root element -->
    <entry>
        <TITLE><![CDATA[xxxx yyyyyyyyyy]]></TITLE>
    </entry>
    <entry>
        <TITLE><![CDATA[xxxx Gold… yyyyyyyyyy]]></TITLE> <!-- line 1008: triggers the "Invalid character" error and the script stops -->
    </entry>
</entries>

I think the ellipsis might be the cause, because when I save the returned XML to a local file with Notepad++ and feed that file to the parser above, it runs fine.

I want to parse this XML directly from the URL instead of saving it to disk first, because that is overhead I don't need. Thanks.
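One way to test the ellipsis theory is to look at the raw bytes the server sends for that title. The sketch below is only an illustration: `looks_like_utf8` is a hypothetical helper, and the two sample strings are made-up stand-ins for line 1008, one with the ellipsis encoded as UTF-8 and one as a single Windows-1252 byte (an assumption about what the server might be sending).

```php
<?php
// Hypothetical helper: true when $bytes is a valid UTF-8 sequence.
function looks_like_utf8(string $bytes): bool
{
    // With the /u modifier, preg_match() validates the subject as UTF-8
    // and returns false (rather than 0 or 1) when it is malformed.
    return preg_match('//u', $bytes) === 1;
}

// Made-up stand-ins for the title on line 1008.
$utf8Title   = "Gold\xE2\x80\xA6"; // U+2026 HORIZONTAL ELLIPSIS as UTF-8
$cp1252Title = "Gold\x85";         // the same character as one Windows-1252 byte

foreach ([$utf8Title, $cp1252Title] as $title) {
    printf("%s => %s\n", bin2hex($title), looks_like_utf8($title) ? 'valid UTF-8' : 'not UTF-8');
}
```

If the real feed shows a lone `85` byte where the ellipsis should be, the document is probably not UTF-8, which would explain why a parser that assumes UTF-8 rejects it.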

Make sure the web server you're pulling the file from sends the correct character encoding when it serves the document. You should see something like this in the response headers:

Content-Type: text/xml; charset=utf-8

When you request the XML file directly, you can view the headers in the network panel of the developer tools in any modern browser.
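The same check can be done from PHP rather than the browser. This is a minimal sketch: `charset_from_headers` is a hypothetical helper that scans header lines (such as those returned by `get_headers()`) for the `charset` parameter of `Content-Type`, and the sample header lines are made up.

```php
<?php
// Hypothetical helper: return the charset declared in a Content-Type
// header, or null when no header line declares one.
function charset_from_headers(array $headers): ?string
{
    $charset = null;
    foreach ($headers as $line) {
        // Matches e.g. "Content-Type: text/xml; charset=utf-8", with or
        // without quotes around the charset value.
        if (preg_match('/^Content-Type:.*?;\s*charset\s*=\s*"?([^";\s]+)/i', $line, $m)) {
            $charset = strtolower($m[1]); // keep the last match (e.g. after redirects)
        }
    }
    return $charset;
}

// Made-up sample; in practice the lines could come from get_headers($url).
$sample = [
    'HTTP/1.1 200 OK',
    'Content-Type: text/xml; charset=UTF-8',
];
var_dump(charset_from_headers($sample)); // string(5) "utf-8"
```

A `null` result means the server declares no charset at all, so the parser has to fall back on the XML declaration or its default.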

You should also specify the encoding in the file itself. The first line should look something like this:

<?xml version="1.0" encoding="UTF-8"?>

If these fail, you can try utf8_decode(), which the PHP manual lists among the XML Parser functions. It converts a UTF-8 string to ISO-8859-1 (note that it is deprecated as of PHP 8.2).
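If you cannot change the server, another option is to convert the bytes to UTF-8 yourself before handing them to the parser. This is a different technique from utf8_decode(): it converts toward UTF-8 rather than away from it. The sketch below assumes the feed is really Windows-1252 (an assumption you would need to verify first) and that the iconv extension is available; `parse_as_utf8` and the inline sample document are hypothetical.

```php
<?php
// Hypothetical helper: convert $xml from $sourceCharset to UTF-8, parse it,
// and return the collected character data. Throws on conversion or parse errors.
function parse_as_utf8(string $xml, string $sourceCharset): string
{
    $utf8 = iconv($sourceCharset, 'UTF-8', $xml);
    if ($utf8 === false) {
        throw new RuntimeException("Could not convert from $sourceCharset");
    }

    $text = '';
    $parser = xml_parser_create('UTF-8');
    // Collect character data instead of echoing it.
    xml_set_character_data_handler($parser, function ($parser, $data) use (&$text) {
        $text .= $data;
    });

    if (xml_parse($parser, $utf8, true) !== 1) {
        throw new RuntimeException(sprintf(
            'XML Error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }
    return $text;
}

// Made-up sample: \x85 is the ellipsis in Windows-1252 (assumed source encoding).
$raw = "<entries><entry><TITLE><![CDATA[xxxx Gold\x85 yyyyyyyyyy]]></TITLE></entry></entries>";
echo parse_as_utf8($raw, 'CP1252'), "\n";
```

This version parses the whole document in one call for simplicity. Because Windows-1252 is a single-byte encoding, no character can be split across a read boundary, so the same conversion could be applied to each 4096-byte chunk in a streaming loop.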
