PHP Simple HTML Dom not parsing certain links
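I am learning about HTML DOM parsers and how they work. I have hit a roadblock: I cannot parse the link below, although I can parse the root domain and other websites. Can someone help me understand why I am unable to parse this particular link?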
<?php
include('simple_html_dom.php');
$base = 'http://www.stupidstudios.com/samsung-galaxy-s6/p/bbuynow';
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $base);
curl_setopt($curl, CURLOPT_REFERER, $base);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$str = curl_exec($curl);
curl_close($curl);
$html_base = new simple_html_dom();
$html_base->load($str);
foreach($html_base->find('h1') as $element) {
echo "<pre>";
print_r( $element );
echo "</pre>";
}
$html_base->clear();
unset($html_base);
?>
It seems to work once you add (spoof) a browser user agent:
$base = 'http://www.flipkart.com/samsung-galaxy-s6/p/itme5z4aypvtrxmy';
$curl = curl_init($base);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$str = curl_exec($curl);
curl_close($curl);
echo $str;
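If the missing user agent is indeed the cause, wrapping the request in a small helper that also checks for failures makes it easier to see why a particular link comes back empty. The following is only a sketch: `fetch_page()` and `DEFAULT_UA` are hypothetical names introduced for illustration, the 15-second timeout is an arbitrary assumption, and the user-agent string is the one used above.

```php
<?php
// Sketch only: fetch_page() and DEFAULT_UA are hypothetical names,
// not part of cURL or simple_html_dom.
const DEFAULT_UA = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13';

/**
 * Fetch a URL while sending a browser-like User-Agent header.
 * Throws RuntimeException on transport errors or HTTP status >= 400.
 */
function fetch_page(string $url, string $userAgent = DEFAULT_UA): string
{
    $curl = curl_init($url);
    curl_setopt_array($curl, [
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_USERAGENT      => $userAgent, // some sites reject requests without one
        CURLOPT_TIMEOUT        => 15,         // assumed value, adjust as needed
    ]);

    $body   = curl_exec($curl);
    $error  = curl_error($curl);
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    // The handle is released when $curl goes out of scope.

    if ($body === false) {
        throw new RuntimeException("cURL request failed: $error");
    }
    if ($status >= 400) {
        throw new RuntimeException("Server answered with HTTP $status");
    }
    return $body;
}

// Possible usage with the parser from the question (requires simple_html_dom.php):
// include 'simple_html_dom.php';
// $html = str_get_html(fetch_page($base));
// if ($html !== false) {
//     foreach ($html->find('h1') as $h1) {
//         echo $h1->plaintext, "\n";
//     }
// }
```

Throwing on a failed transfer or an error status separates "the server refused the request" from "the page has no h1", which the original loop cannot tell apart because it simply prints nothing in both cases.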