
Extracting a value from a webpage using Simple HTML DOM

I searched online and found a way to extract data using Simple HTML DOM, but it gives me the following errors:

Warning: file_get_contents( http://www.flipkart.com/moto-g-2nd-gen/p/itme6g3wferghmv3 ): failed to open stream: HTTP request failed! HTTP/1.1 500 Server Error in C:\Users\Abhishek\Desktop\editor\request\simple_html_dom.php on line 75

Fatal error: Call to a member function find() on boolean in C:\Users\Abhishek\Desktop\editor\request\main.php on line 9

My PHP code is:

<?php 

include('simple_html_dom.php');

$html = file_get_html('http://www.flipkart.com/moto-g-2nd-gen/p/itme6g3wferghmv3');


foreach($html->find('span.selling-price.omniture-field') as $e)
    echo $e->outertext . '<br>';

?>

I am new to programming and don't have much experience. Is there a mistake in my program?

Make sure the fopen wrappers are enabled for this. From the manual:

A URL can be used as a filename with this function if the fopen wrappers have been enabled.

If they are disabled, file_get_contents() returns false, so file_get_html() returns false as well, and calling find() on that boolean causes your second error.
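The check described above can be sketched roughly as follows. The helper names url_wrappers_enabled() and load_or_fail() are hypothetical and not part of Simple HTML DOM; the sketch only assumes that the loader returns false on failure, as the second error message suggests.

```php
<?php
// Hypothetical helper: report whether URL-aware fopen wrappers are enabled.
function url_wrappers_enabled(): bool
{
    // ini_get() returns the setting as a string such as "1", "0" or "".
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

// Hypothetical helper: call a loader and turn a false result into an exception,
// so the script never reaches find() with a boolean.
function load_or_fail(callable $loader, string $url)
{
    $result = $loader($url);
    if ($result === false) {
        throw new RuntimeException("Could not load $url");
    }
    return $result;
}

if (!url_wrappers_enabled()) {
    echo "allow_url_fopen is disabled, so URLs cannot be opened as files\n";
}

// Usage with the question's code (left commented out: it needs
// simple_html_dom.php and network access):
// $html = load_or_fail('file_get_html', 'http://www.flipkart.com/moto-g-2nd-gen/p/itme6g3wferghmv3');
```

With a guard like this, a failed request stops with a clear message instead of the fatal "find() on boolean" error.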

The server is probably rejecting your request based on the User-Agent. Try using cURL to fetch the page HTML, for example:

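<?php
$url = "http://www.flipkart.com/moto-g-2nd-gen/p/itme6g3wferghmv3";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// CURLOPT_USERAGENT takes only the header value, without a "User-Agent: " prefix.
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
curl_setopt($ch, CURLOPT_ENCODING, "");       // accept any encoding cURL supports
$pagebody = curl_exec($ch);
curl_close($ch);

// curl_exec() returns false on failure; stop before trying to parse.
if ($pagebody === false) {
    die('Request failed');
}

include('simple_html_dom.php');
$html = str_get_html($pagebody);

foreach ($html->find('.selling-price') as $e)
    echo $e->outertext . '<br>';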

Output:

Rs. 10,999


Note:

I can confirm the server is rejecting your request based on the User-Agent.
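If you would rather stay with file_get_contents() than switch to cURL, a User-Agent can also be sent through a stream context. This is only a sketch: make_browser_context() and fetch_page() are hypothetical helper names, and I have not checked whether this server accepts requests made this way.

```php
<?php
// Hypothetical helper: build a stream context that sends a browser-like User-Agent.
function make_browser_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'user_agent'      => $userAgent, // value of the User-Agent header
            'follow_location' => 1,          // follow redirects, like the cURL version
        ],
    ]);
}

// Hypothetical helper: fetch a page body, or return null if the request fails.
function fetch_page(string $url, string $userAgent): ?string
{
    // @ suppresses the warning; failure is reported through the return value instead.
    $body = @file_get_contents($url, false, make_browser_context($userAgent));
    return $body === false ? null : $body;
}

// Usage (left commented out: it needs network access and simple_html_dom.php):
// $ua   = 'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0';
// $body = fetch_page('http://www.flipkart.com/moto-g-2nd-gen/p/itme6g3wferghmv3', $ua);
// if ($body !== null) {
//     $html = str_get_html($body);
// }
```

Unlike the cURL version, this does not request or decode compressed responses, and it still requires allow_url_fopen to be enabled.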

