PHP DomDocument, DomXPath encoding issue
I'm having a problem with the encoding of a WordPress feed that I just can't seem to figure out.
I was loading my feed with DOMDocument->load, but then switched to file_get_contents followed by ->loadXML, with the same results. I used loadXML so I could manipulate the feed if needed.
The correct output that I'm looking for is - ' £. If I just echo from an XPath query, I get - ‘ £. If I echo with utf8_decode I get - ? £. A lot better, but the question mark should be an apostrophe.
If I loop through each node of the DomDocument when it is loaded, I get the correct output. So it seems that it's being handled incorrectly in XPath.
Any thoughts?
The feed is http://shredeasy.com/blog/category/news/feed
Here is the function that is being called:
function getPostsInCategory($feed=NULL){
    if(is_null($feed)){ echo "Wrong Usage. Need a valid Category Feed. Most likely from getCategories()."; return false; }
    $feedx = file_get_contents($feed);
    $xml = new DOMDocument();
    $xml->loadXML($feedx);
    //$this->showDOMNode($xml);
    //$xml->load($feed);
    $xpath = new DomXPath($xml);
    $xpath->registerNamespace("content", "http://web.resource.org/rss/1.0/modules/content/");
    $cat = array();
    foreach($xml->getElementsByTagName('item') as $c){
        $elements = array();
        $elements["title"] = $xpath->query("title", $c)->item(0)->nodeValue;
        echo utf8_decode($elements["title"]);
I have been trying to figure this out for hours and I keep circling back to the wrong thing.
Thanks for the help!
You're right, it seems that apostrophes are turning into question marks... Gosh! I don't know whether that's the only issue.
The string being echoed is encoded in UTF-8. If your page is encoded in UTF-8, just echo it, possibly after calling htmlspecialchars with the third argument set to "UTF-8". To convert it to another encoding, see iconv and mb_convert_encoding.
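As a rough illustration of the two options, here is a minimal sketch. The sample string is hypothetical data standing in for a feed title, and Windows-1252 is only an assumed example of a non-UTF-8 page encoding; neither comes from the original post. It also shows why a Latin-1 conversion (which is what utf8_decode performs) produces a question mark: ISO-8859-1 has no curly quote characters.

```php
<?php
// Hypothetical UTF-8 title: curly quotes (U+2018, U+2019) and a pound sign (U+00A3).
$title = "\u{2018}Offer\u{2019} \u{00A3}5 & more";

// Option 1: the page is served as UTF-8 (e.g. Content-Type: text/html; charset=UTF-8).
// Escape HTML special characters and echo the string unchanged otherwise.
$escaped = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
echo $escaped, "\n";

// Option 2: the page uses another encoding. Convert explicitly to that encoding.
// Windows-1252 is an assumed target here; it does contain the curly quotes.
// iconv() could be used in a similar way.
$cp1252 = mb_convert_encoding($title, 'Windows-1252', 'UTF-8');

// For comparison: ISO-8859-1 cannot represent U+2018 or U+2019, so with the
// default mbstring substitute character they are replaced by "?".
$latin1 = mb_convert_encoding($title, 'ISO-8859-1', 'UTF-8');

// Show the raw bytes so the difference is visible regardless of terminal encoding.
echo bin2hex($cp1252), "\n";
echo bin2hex($latin1), "\n";
```

Whichever option is used, the encoding declared by the page (HTTP header or meta tag) has to match the bytes actually sent, otherwise the browser will misinterpret them.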