CURL Error in Encoding the Characters
I am trying to pull some data from a web page. The problem is that instead of pulling, say:
64 × 191 × 75 cm
it shows up on echo as:
64 × 191 × 75 cm
My code:
<?php
$url = "http://www.google.co.uk";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)");
curl_setopt($ch, CURLOPT_ENCODING ,"");
$html = curl_exec($ch);
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$q_Dimensions = "//tr/td[@class='FieldTitle'][contains(.,'Dimensions of packed product (W×H×D):')]/following-sibling::td/text()";
$dimentionsQ = $xpath->query($q_Dimensions);
$dimentions = $dimentionsQ->item(0)->nodeValue;
echo $dimentions;
exit();
I believe this may be some kind of character-encoding problem, but I could not get any further with it. Any help is much appreciated.
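Set one more cURL option, CURLOPT_ENCODING, to "" (an empty string) to make sure the response does not come back as garbage. With an empty string, cURL sends an Accept-Encoding header listing every encoding it supports and decodes the response for you: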
curl_setopt($ch, CURLOPT_ENCODING ,"");
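To make "garbage" concrete, here is a minimal sketch with no network access, assuming the zlib extension is available. gzencode() stands in for a server that compresses its response, and the sample string is just the value from the question. It only illustrates the kind of decoding cURL takes over once CURLOPT_ENCODING is set; it is not how cURL is implemented.

```php
<?php
// Stand-in for a response body that a server compressed with gzip.
$text = "64 \u{00D7} 191 \u{00D7} 75 cm";
$compressed = gzencode($text);

// Left undecoded, the body begins with the gzip magic bytes, not readable text.
$magic = bin2hex(substr($compressed, 0, 2));
echo $magic, "\n"; // 1f8b

// Decoding restores the original bytes exactly.
$decoded = gzdecode($compressed);
var_dump($decoded === $text); // bool(true)
```

If the output is still wrong after the body has been decoded, the remaining suspect is the character set rather than the compression.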
In addition, setting the charset to UTF-8 with header(), so that the browser decodes the echoed bytes as UTF-8, also works fine:
// add this on the top of your php script
header('Content-Type: text/html; charset=utf-8');
$url = "google.co.uk";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)");
curl_setopt($ch, CURLOPT_ENCODING ,"");
$html = curl_exec($ch);
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$q_Dimensions = "//tr/td[@class='FieldTitle'][contains(.,'Dimensions of packed product (W×H×D):')]/following-sibling::td/text()";
$dimentionsQ = $xpath->query($q_Dimensions);
$dimentions = $dimentionsQ->item(0)->nodeValue;
echo $dimentions; // 64 × 191 × 75 cm
exit();
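Why the charset matters can be shown at the byte level. The following is a rough sketch, assuming the mbstring extension is installed and assuming the page is served as UTF-8. The variable names are made up for illustration. It reproduces one common way a `×` gets mangled (each UTF-8 byte read as a separate ISO-8859-1 character) and shows that the damage is reversible.

```php
<?php
// "×" (U+00D7) is two bytes in UTF-8: 0xC3 0x97.
$times = "\u{00D7}";
echo bin2hex($times), "\n"; // c397

// Hypothetical misreading: treat each byte as one ISO-8859-1 character
// and re-encode the result as UTF-8. One character becomes two.
$misread = mb_convert_encoding($times, 'UTF-8', 'ISO-8859-1');
echo bin2hex($misread), "\n"; // c383c297

// The same misreading applied to the whole value from the question.
$original = "64 \u{00D7} 191 \u{00D7} 75 cm";
$garbled  = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Undoing the conversion recovers the original string.
$restored = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
var_dump($restored === $original); // bool(true)
```

Declaring the charset in the Content-Type header avoids this guesswork on the browser side, because the bytes are then interpreted with the encoding they were written in.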