
PHP cURL working on one localhost but not on the other

A client asked me to scrape a website using PHP cURL. I wrote the script and it worked fine on my localhost, but when I handed it over, it did not work on the client's localhost.

<?php

ini_set('display_errors', 'On');
error_reporting(E_ALL);

print "Cascading https://www.autotrader.ca/cars/on/toronto/?rcp=15&rcs=0&prx=100&prv=Ontario&loc=toronto%2C%20on&hprc=True&wcp=True&sts=New-Used&inMarket=basicSearch&mdl=Accent&make=Hyundai&scrladid=11543266:<p>";

$array = [];
$array[] = "/a/hyundai/accent/oshawa/ontario/19_11543266_/?showcpo=ShowCpo&amp;ncse=no&amp;orup=1_15_340&amp;sprx=100";
$array[] = "/a/hyundai/accent/cambridge/ontario/5_48590586_20200220145456261/?showcpo=ShowCpo&amp;ncse=no&amp;orup=2_15_340&amp;sprx=100";
$array[] = "/a/hyundai/accent/mississauga/ontario/19_11536424_/?showcpo=ShowCpo&amp;ncse=no&amp;orup=3_15_340&amp;sprx=100";

foreach ($array as $key=>$value)
{
    $scrape = "https://www.autotrader.ca".$array[$key];
    print "Scraping $scrape<p>";
    echo "<br>";

    $user_agent = 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Mobile Safari/537.36';

    $headers = [
        'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'accept-encoding: gzip, deflate, br',
        'accept-language: en-US,en;q=0.9',
        'cache-control: max-age=0',
        'sec-fetch-dest: document',
        'sec-fetch-mode: navigate',
        'sec-fetch-site: none',
        'sec-fetch-user: ?1',
        'upgrade-insecure-requests: 1',
        'user-agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Mobile Safari/537.36',
    ];
    
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $scrape);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15);
    curl_setopt($ch, CURLOPT_TIMEOUT, 100);
    curl_setopt($ch, CURLOPT_ENCODING, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    // curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Length: 0'));
    curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__) . '/cookie.txt');
    curl_setopt($ch, CURLOPT_COOKIEFILE, dirname(__FILE__) . '/cookie.txt');

    $contents = curl_exec($ch);
    
    if ($contents === FALSE){
        echo "Error : ".curl_error($ch);
        echo "<br>";
        print "contents returned for $key = FALSE<br>";
    }

    curl_close($ch);
    
    // echo $contents;

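    // Extract the text between <title> and </title>; "<title>" is 7 characters long,
    // so the length must exclude those 7 characters as well.
    $start_pos = strpos($contents, "<title>", 0);
    $end_pos = strpos($contents, "</title>", 0);
    $title = substr($contents, $start_pos+7, $end_pos-$start_pos-7);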
    
    print "Listing $key: $title<p>";
    echo "<br>";
    echo "<br>";
}

He also told me that he had previously scraped this website using a method other than cURL, and he thinks the site has since restricted his requests to their server. Note, however, that he can still visit the website in a browser. I also confirmed that he gets a correct response if he replaces the URL in the cURL call with a Google URL.

The most likely issue here is that your client's PHP installation does not have the php-curl extension installed or enabled. How to install or enable it depends on the OS and on how PHP was installed, but here are a few common situations:

For Ubuntu or other Debian-based Linux distributions:

apt-get install php7.4-curl
systemctl restart apache2

In the first command, replace '7.4' with the version of PHP that you are currently using.

For WAMP on Windows: How to enable curl in Wamp server

For XAMPP on Windows: How to enable cURL in PHP / XAMPP
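
Before installing anything, it may help to confirm what the client's PHP actually reports. The following is a minimal sketch, not a definitive diagnostic: describe_curl_support() is a hypothetical helper name, and the Brotli check is included only on the assumption that it matters because the script in the question sends 'accept-encoding: gzip, deflate, br'. Running it on both machines and comparing the output is one way to spot a difference between the two setups.

```php
<?php
// Hypothetical helper (not part of the original script):
// summarize what the local PHP build reports about cURL.
function describe_curl_support(): array
{
    if (!extension_loaded('curl')) {
        // The extension is missing or disabled in php.ini.
        return ['loaded' => false];
    }

    $info = curl_version();

    // CURL_VERSION_BROTLI is only defined on newer PHP/libcurl builds,
    // so guard the bitmask test with defined().
    $brotli = defined('CURL_VERSION_BROTLI')
        && ($info['features'] & CURL_VERSION_BROTLI) !== 0;

    return [
        'loaded'       => true,
        'curl_version' => $info['version'],
        'ssl_version'  => $info['ssl_version'],
        'brotli'       => $brotli,
    ];
}

// Print one "name: value" line per entry.
foreach (describe_curl_support() as $name => $value) {
    echo $name . ': ' . var_export($value, true) . "\n";
}
```

If 'loaded' is false, the installation steps above apply. If it is true on both machines, the extension itself is probably not the cause.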

I ran it behind a proxy and it works fine. I simplified the script and corrected a few small mistakes.

Try this, and do not forget to comment out or edit the CURLOPT_PROXY line.

<?php
ini_set('display_errors', 'On');
error_reporting(E_ALL);

$array = [
    "/a/hyundai/accent/oshawa/ontario/19_11543266_/?showcpo=ShowCpo&amp;ncse=no&amp;orup=1_15_340&amp;sprx=100",
    "/a/hyundai/accent/cambridge/ontario/5_48590586_20200220145456261/?showcpo=ShowCpo&amp;ncse=no&amp;orup=2_15_340&amp;sprx=100",
    "/a/hyundai/accent/mississauga/ontario/19_11536424_/?showcpo=ShowCpo&amp;ncse=no&amp;orup=3_15_340&amp;sprx=100"
];

foreach ($array as $key => $value) {
    $scrape = "https://www.autotrader.ca" . $value;
    echo "Scraping " . $scrape . "<br>\n";
    
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $scrape);
    curl_setopt($ch, CURLOPT_PROXY, "http://<proxy_url>:80"); // Comment if not behind a proxy
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__) . '/cookie.txt');
    curl_setopt($ch, CURLOPT_COOKIEFILE, dirname(__FILE__) . '/cookie.txt');
    $contents = curl_exec($ch);

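    if ($contents === false) {
        // curl_exec() returns false on failure; report the error and stop.
        echo "Error : " . curl_error($ch) . "<br>\n";
        curl_close($ch);
        break;
    }
    curl_close($ch);

    // Extract the text between <title> and </title>, if the page has one.
    $title = explode("<title>", $contents);
    $title = isset($title[1]) ? explode("</title>", $title[1])[0] : "(no title found)";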

    echo "Listing " . $key . ": " . $title . "<br>\n";
    echo "<br>\n";
    echo "<br>\n";
}
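
Two details of the scripts on this page can be isolated into small helpers: the listing paths contain the HTML entity &amp;amp; (they look like they were copied from href attributes), and the page title is pulled out by string splitting. The sketch below is one possible way to handle both. The function names build_listing_url() and extract_title() are hypothetical, and whether the site treats the decoded and undecoded query strings differently is an assumption that was not verified here.

```php
<?php
// Hypothetical helper: join the base URL and a path copied from an href,
// decoding HTML entities such as &amp; back into plain characters.
function build_listing_url(string $base, string $path): string
{
    return rtrim($base, '/') . html_entity_decode($path);
}

// Hypothetical helper: return the trimmed <title> text,
// or null when the document has no title element.
function extract_title(string $html): ?string
{
    // "i" makes the match case-insensitive, "s" lets "." span newlines.
    if (preg_match('~<title[^>]*>(.*?)</title>~is', $html, $m) !== 1) {
        return null;
    }
    return trim(html_entity_decode($m[1]));
}

// Usage with a path from the script and an inline HTML string (no network access).
echo build_listing_url(
    'https://www.autotrader.ca',
    '/a/hyundai/accent/oshawa/ontario/19_11543266_/?showcpo=ShowCpo&amp;ncse=no'
) . "\n";
var_dump(extract_title('<html><head><title> Example </title></head></html>'));
var_dump(extract_title('<html><body>blocked</body></html>'));
```

Returning null for a missing title makes it easier to tell a blocked or empty response apart from a real listing page.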
