cURL Access Denied in a PHP crawler

I'm creating a crawler to capture some public information. However, it returns:

Access Denied
You don't have permission to access "http://www.americanas.com.br/" on this server.

When I test the request in Postman, it works perfectly. I even copied the cURL code generated by Postman (shown below), but when I run it directly on my PHP server, it returns the error above.

My cURL code:

$curl = curl_init();

curl_setopt_array($curl, array(
    CURLOPT_URL => "https://www.americanas.com.br/",
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => "",
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "GET",
    CURLOPT_HTTPHEADER => array(
        "cache-control: no-cache",
        "postman-token: 112ebf89-1bb7-aa7a-0645-cdeabcf96488"
    ),
));

$response = curl_exec($curl);
$err = curl_error($curl);

curl_close($curl);

if($err) echo "cURL Error #:" . $err;
else echo $response;
exit();

I found that some sites use more sophisticated blocking. In these cases, a more complete crawling solution is necessary. The one I'm using, and which works for me, is ProxyCrawl (https://proxycrawl.com/).

Postman is querying https://www.americanas.com.br/, while the error message suggests that your crawler is querying http://www.americanas.com.br/.
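One way to check whether the scheme differs is to force https before the request and then ask cURL which URL it actually ended up on. The following is only a minimal sketch: the helper names forceHttps and fetchWithDiagnostics are hypothetical (they are not part of the question's code), and enabling CURLOPT_FOLLOWLOCATION is an assumption about the desired behavior. Note that CURLOPT_MAXREDIRS only has an effect when redirects are followed.

```php
<?php
// Hypothetical helper: rewrite a leading http:// to https:// so the crawler
// requests the same scheme that Postman used.
function forceHttps(string $url): string
{
    return preg_replace('#^http://#i', 'https://', $url) ?? $url;
}

// Hypothetical helper: perform a GET request and return diagnostic details
// alongside the body, so the requested scheme and status can be inspected.
function fetchWithDiagnostics(string $url): array
{
    $curl = curl_init();

    curl_setopt_array($curl, [
        CURLOPT_URL            => forceHttps($url),
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // assumption: follow redirects
        CURLOPT_MAXREDIRS      => 10,    // only applies when redirects are followed
        CURLOPT_TIMEOUT        => 30,
    ]);

    $body = curl_exec($curl);

    return [
        'body'          => $body,                                         // string, or false on failure
        'error'         => curl_error($curl),                             // empty string when no error occurred
        'status'        => curl_getinfo($curl, CURLINFO_HTTP_CODE),       // HTTP status of the last response
        'effective_url' => curl_getinfo($curl, CURLINFO_EFFECTIVE_URL),   // last URL cURL actually requested
    ];
}
```

Calling fetchWithDiagnostics("http://www.americanas.com.br/") and printing the 'effective_url' and 'status' entries shows which scheme was really used and whether the server still answers with an access-denied status. If it does even over https, the scheme is probably not the cause, and the site may be filtering on something else.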
