
PHP cURL Website scraping not working

I have cURL-based code that fetches the price of a product from a website. I want to scrape the price from http://www.snapdeal.com/product/apple-iphone-5s-16-gb/1302850866

The price is located in:

<div class="prodbuy-price">
<div id="mrp-price-outer" class="">
<div id="seller-price-outer" class="">
<div id="offer-price-id">
<meta content="INR" itemprop="priceCurrency">
<strong class="voucherPrice">
Rs
<span id="selling-price-id" itemprop="price">36500</span>
</strong>

My code for fetching the price is :

<?php
$curl = curl_init('http://www.snapdeal.com/product/apple-iphone-5s-16-gb/1302850866');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');

$page = curl_exec($curl);

if(!empty($page)){ //if any html is actually returned (curl_exec gives false on failure)

    $pokemon_doc = new DOMDocument;
    libxml_use_internal_errors(true);
    $pokemon_doc->loadHTML($page);
    libxml_clear_errors(); //remove errors for yucky html

    $pokemon_xpath = new DOMXPath($pokemon_doc);

   // $price = $pokemon_xpath->evaluate('string(//div[@class="prices"]/meta[@itemprop="price"]/@content)');
   // echo $price;

    $rupees = $pokemon_xpath->evaluate('string(//div[@class="prodbuy-price"]/span[@itemprop="price"])');
    echo $rupees;
}
else {
    print "Not found";
}
?>

I am not getting any errors, but no data (the price) is displayed either, so I cannot track down what is going wrong.

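I had made a very silly mistake: the XPath used a single '/' between the div and the span, which matches only direct children, but the span is nested several levels below the "prodbuy-price" div, so nothing matched and evaluate() returned an empty string. Adding an extra '/' (making it '//', which matches descendants at any depth) resolved the issue. Thanks to @DaveCoast for this. The new code is: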

<?php
$curl = curl_init('http://www.snapdeal.com/product/apple-iphone-5s-16-gb/1302850866');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');

$page = curl_exec($curl);

if(!empty($page)){ //if any html is actually returned (curl_exec gives false on failure)

    $pokemon_doc = new DOMDocument;
    libxml_use_internal_errors(true);
    $pokemon_doc->loadHTML($page);
    libxml_clear_errors(); //remove errors for yucky html

    $pokemon_xpath = new DOMXPath($pokemon_doc);

   // $price = $pokemon_xpath->evaluate('string(//div[@class="prices"]/meta[@itemprop="price"]/@content)');
   // echo $price;

    $rupees = $pokemon_xpath->evaluate('string(//div[@class="prodbuy-price"]//span[@itemprop="price"])');
    echo $rupees;
}
else {
    print "Not found";
}
?>
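
To see why the extra slash matters without hitting the live site, here is a minimal offline sketch. It assumes a trimmed, hand-closed copy of the markup quoted in the question (the meta tag is left out, and the $html string and the names $child and $descendant are made up for illustration); the real page may well differ.

```php
<?php
// Trimmed, hand-closed copy of the markup from the question (an assumption,
// not the live page).
$html = <<<'HTML'
<div class="prodbuy-price">
  <div id="seller-price-outer" class="">
    <div id="offer-price-id">
      <strong class="voucherPrice">
        Rs
        <span id="selling-price-id" itemprop="price">36500</span>
      </strong>
    </div>
  </div>
</div>
HTML;

$doc = new DOMDocument;
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Single '/': the span would have to be a direct child of the div.
$child = $xpath->evaluate('string(//div[@class="prodbuy-price"]/span[@itemprop="price"])');

// Double '//': the span may be nested at any depth below the div.
$descendant = $xpath->evaluate('string(//div[@class="prodbuy-price"]//span[@itemprop="price"])');

var_dump($child);      // string(0) ""
var_dump($descendant); // string(5) "36500"
```

With the single slash the node-set is empty, and string() of an empty node-set is an empty string, which is why the original script printed nothing and raised no error.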

Hope this helps someone!


 