
CURL PHP Crawler Returns Access Denied Error

My dad loves these frozen cheeseburgers from Meijer, so I was going to write a little script to run from cron that checks Meijer's website and sends a text or email if they go on sale.

Whenever I run the script below, I get an Access Denied response from the server instead of the HTML for the cheeseburger page.

I'm sure I just need a cURL option or something.

Thank you in advance.

function curl_download($Url)
{
    // Bail out early if the cURL extension is not loaded.
    if (!function_exists('curl_init')) {
        die('cURL is not installed. Install and try again.');
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $Url);

    // Note: this disables TLS certificate verification.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_VERBOSE, true);        // log request/response details to stderr
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_HEADER, true);         // include response headers in the output

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0");

    $output = curl_exec($ch);
    curl_close($ch);

    return $output;
}

print curl_download("https://www.meijer.com/shop/en/frozen/frozen-meals/sandwiches/meijer-bacon-cheeseburger-4-9-oz/p/71373326278");

You very likely need to add a few more header settings before the server can no longer tell that the request comes from curl rather than from a regular human user.

Can servers block curl requests?

Some websites use even more advanced techniques to determine whether you are a human user, such as checking whether JavaScript ran on an earlier page, but at a minimum your request headers need to match those of a real browser request.

You can use the Google Chrome or Firefox inspector to see which request headers you should be sending.
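As a rough illustration of the advice above, the sketch below passes a browser-like header set to cURL through CURLOPT_HTTPHEADER. The helper names build_browser_headers and curl_download_with_headers are hypothetical, and the header values are placeholders rather than values known to satisfy Meijer's server; replace them with whatever your browser's inspector shows for the same page. Matching headers may still not be enough if the site relies on JavaScript checks.

```php
<?php
// Hypothetical helper: turns a name => value map into the "Name: value"
// lines that CURLOPT_HTTPHEADER expects. The default values are
// illustrative assumptions; copy the real ones from the browser inspector.
function build_browser_headers(array $overrides = [])
{
    $defaults = [
        'Accept'                    => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language'           => 'en-US,en;q=0.5',
        'Connection'                => 'keep-alive',
        'Upgrade-Insecure-Requests' => '1',
    ];

    $lines = [];
    foreach (array_merge($defaults, $overrides) as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Hypothetical variant of curl_download() that also sends the extra headers.
// Returns the response body as a string, or false if the transfer failed.
function curl_download_with_headers($url, array $headers)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0');
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    // An empty string asks cURL to advertise every encoding it supports
    // and to decode the response automatically.
    curl_setopt($ch, CURLOPT_ENCODING, '');

    return curl_exec($ch);
}
```

A call might look like `print curl_download_with_headers($url, build_browser_headers(['Referer' => 'https://www.meijer.com/']));`, where the Referer value is again only an assumption about what a real browser would send.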
