
Why do I get wrong data using curl?

I'm trying to fetch an RSS feed, but for some reason I get the wrong data:

$url = "http://rss.news.yahoo.com/rss/oddlyenough";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
$xml = curl_exec($ch);      
curl_close($ch);
echo htmlentities($xml, ENT_QUOTES, "UTF-8");

The output:

<!-- rc2.ops.ch1.yahoo.com uncompressed/chunked Sun Nov 25 15:57:06 UTC 2012 --> 

If I load the same feed another way, I get the correct data. For example, this works:

$xml = simplexml_load_file('http://rss.news.yahoo.com/rss/oddlyenough');
print "<ul>\n";
foreach ($xml->channel->item as $item){
  print "<li>$item->title</li>\n";
}
print "</ul>";

Could you please tell me what is wrong with the code that uses curl?

You're running into a redirect: the server answers with a Location header, and curl does not follow it by default.

Add this option:

  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

so that the code becomes:

$url = "http://rss.news.yahoo.com/rss/oddlyenough";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
$xml = curl_exec($ch);      
curl_close($ch);
echo htmlentities($xml, ENT_QUOTES, "UTF-8");

Details

When you run the code, the first response you receive from Yahoo! is:

HTTP/1.0 301 Moved Permanently
Date: Sun, 25 Nov 2012 16:31:36 GMT
P3P: policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"
Cache-Control: max-age=3600, public
Location: http://news.yahoo.com/rss/oddlyenough
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8
Age: 1586
Content-Length: 81
Via: HTTP/1.1 rc4.ops.ch1.yahoo.com (YahooTrafficServer/1.20.10 [cHs f ])
Server: YTS/1.20.10

<!-- rc4.ops.ch1.yahoo.com uncompressed/chunked Sun Nov 25 16:31:36 UTC 2012 -->

It tells you to use the new address, http://news.yahoo.com/rss/oddlyenough. The body of that redirect response is the HTML comment that showed up in your output.
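If you want to detect such a redirect yourself rather than follow it blindly, the status line and the Location header can be read from the raw header text. The following is only a minimal sketch: find_redirect_target is a hypothetical helper name, not part of cURL, and it assumes you captured the headers as a string (for example by setting CURLOPT_HEADER to 1). The sample headers are shortened from the response shown above.

```php
<?php
// Hypothetical helper (not a cURL function): returns the redirect target
// found in a raw HTTP header block, or null when the response is not a
// 3xx or carries no Location header.
function find_redirect_target(string $rawHeaders): ?string
{
    $lines = preg_split('/\r\n|\n|\r/', trim($rawHeaders));

    // The status line looks like "HTTP/1.0 301 Moved Permanently".
    if (!preg_match('#^HTTP/\S+\s+(\d{3})#', $lines[0], $m)) {
        return null;
    }
    $status = (int) $m[1];
    if ($status < 300 || $status >= 400) {
        return null;
    }

    // Header names are case-insensitive, hence stripos().
    foreach (array_slice($lines, 1) as $line) {
        if ($line === '') {
            break; // blank line marks the end of the headers
        }
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

// Shortened copy of the 301 response quoted in the answer.
$sample = "HTTP/1.0 301 Moved Permanently\r\n"
        . "Cache-Control: max-age=3600, public\r\n"
        . "Location: http://news.yahoo.com/rss/oddlyenough\r\n"
        . "Content-Length: 81\r\n";

echo find_redirect_target($sample), "\n";
```

For the sample it prints http://news.yahoo.com/rss/oddlyenough; for a 200 response it returns null.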

In fact, if you use the new address directly, your original code works (until they change the address again, that is) and is a bit faster, since it makes one request instead of two.
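One way to keep following redirects while still noticing when the address moves is to ask cURL for the final URL after the transfer. This is a sketch under assumptions: fetch_with_redirects and its array keys are hypothetical names, and the cap of 5 redirects is an arbitrary choice. CURLOPT_MAXREDIRS, CURLINFO_HTTP_CODE and CURLINFO_EFFECTIVE_URL are standard PHP cURL constants.

```php
<?php
// Hypothetical wrapper: fetches $url, follows at most $maxRedirects
// redirects, and reports the status code and the address finally used.
function fetch_with_redirects(string $url, int $maxRedirects = 5): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, $maxRedirects); // avoid redirect loops
    $body = curl_exec($ch);

    $result = [
        'body'      => $body === false ? null : $body,
        'error'     => $body === false ? curl_error($ch) : null,
        'status'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
    ];
    curl_close($ch);
    return $result;
}
```

Calling fetch_with_redirects("http://rss.news.yahoo.com/rss/oddlyenough") and comparing $result['final_url'] with the URL you requested shows whether a redirect happened, so you can update the stored address and save the extra request next time.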
