PHP DomDocument, DomXPath encoding problem

Question

I'm having a problem with encoding from a WordPress feed that I just can't seem to figure out.

I was loading my feed with DOMDocument->load, but then switched to file_get_contents and am now using ->loadXML, with the same results. I used loadXML so I could manipulate the feed if needed.

The correct output that I'm looking for is - ' £. If I just echo from an XPath query, I get - â€˜ Â£. If I echo with utf8_decode, I get - ? £. A lot better, but the question mark should be an apostrophe.

If I loop through each node of the DomDocument when it is loaded, I get the correct output. So it seems that it's being handled incorrectly in XPath.

Any thoughts?

The feed is http://shredeasy.com/blog/category/news/feed

Here is the function that is being called:

function getPostsInCategory($feed=NULL){
    if(is_null($feed)){ echo "Wrong Usage. Need a valid Category Feed.  Most likely from getCategories()."; return false; }
    $feedx = file_get_contents($feed);
    $xml = new DOMDocument();
    $xml->loadXML($feedx);
    //$this->showDOMNode($xml);


    //$xml->load($feed);
    $xpath = new DomXPath($xml);
    $xpath->registerNamespace("content", "http://web.resource.org/rss/1.0/modules/content/");

    $cat = array();
    foreach($xml->getElementsByTagName('item') as $c){
        $elements = array();
        $elements["title"] = $xpath->query("title", $c)->item(0)->nodeValue;
        echo utf8_decode($elements["title"]);

I have been trying to figure this out for hours and I keep circling back to the wrong thing.

Thanks for the help!

You know, it does seem that the apostrophes are turning into question marks... Gosh! I don't know if that's the only issue or not.

Answer 1

The string being echoed is encoded in UTF-8.

If your page is encoded in UTF-8, you can just echo it, possibly calling htmlspecialchars with the third argument set to "UTF-8".
Otherwise, you have to convert it first to whatever encoding your webpage is using. See iconv and mb_convert_encoding.
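
As a rough sketch of both options (not the asker's actual code): the inline XML below is a made-up stand-in for the feed, and Windows-1252 is only an assumed example of a non-UTF-8 page encoding. It also suggests why utf8_decode gave a question mark: utf8_decode targets ISO-8859-1, which has the pound sign but no curly quote, so the quote is replaced with "?".

```php
<?php
// Hypothetical stand-in for the WordPress feed. The title contains
// U+2018/U+2019 (curly single quotes) and U+00A3 (pound sign).
$feedXml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
    . "<rss version=\"2.0\"><channel><item>"
    . "<title>\u{2018}Shred\u{2019} for \u{00A3}5</title>"
    . "</item></channel></rss>";

$doc = new DOMDocument();
$doc->loadXML($feedXml);
$xpath = new DOMXPath($doc);

// The DOM extension returns node values as UTF-8 strings,
// regardless of the encoding the source document declared.
$title = $xpath->query('//item/title')->item(0)->nodeValue;

// Option 1: the page is served as UTF-8, so only HTML-escape the value.
// A matching header would be sent before any output, for example:
// header('Content-Type: text/html; charset=UTF-8');
$forUtf8Page = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

// Option 2: the page uses some other encoding (Windows-1252 is assumed
// here purely for illustration), so convert the UTF-8 string first.
// mb_convert_encoding($title, 'Windows-1252', 'UTF-8') is an alternative.
$forCp1252Page = iconv('UTF-8', 'Windows-1252', $title);

echo $forUtf8Page, "\n";
```

With option 1 the browser has to be told the page is UTF-8 (via the Content-Type header or a meta tag); otherwise it will still render the bytes as â€˜ Â£.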

Answer 1
1 Accepted 2010-06-28 16:51:28