Multi Curl, error handling
I'm working with multi cURL and was wondering how to handle errors. I want to check which error occurred, and if it is something like a rate limit being exceeded, I want to crawl that link again after some delay (sleep()). My question: is there a built-in function that can do this for me, or do I need to collect all the URLs in an array and just run those again?
This is what I've got now:
<?php
$urls = array( "https://API-URL.com",
"https://API-URL.com",
"https://API-URL.com",
"https://API-URL.com",
...);
//create the multiple cURL handle
$mh = curl_multi_init();
//Number of elements in $urls
$nbr = count($urls);
// set URL and options
for($x = 0; $x < $nbr; $x++){
// create a cURL handle for this URL
$ch[$x] = curl_init();
// set URL and other appropriate options
curl_setopt($ch[$x], CURLOPT_URL, $urls[$x]);
curl_setopt($ch[$x], CURLOPT_RETURNTRANSFER, true );
curl_setopt($ch[$x], CURLOPT_SSL_VERIFYPEER, false);
//add the handle to the multi handle
curl_multi_add_handle($mh,$ch[$x]);
}
//execute the handles
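do {
$status = curl_multi_exec($mh, $running);
if ($running) {
// wait for activity on any connection instead of busy-looping
curl_multi_select($mh);
}
} while ($running && $status == CURLM_OK);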
for($x = 0; $x < $nbr; $x++){
$result = curl_multi_getcontent($ch[$x]);
$decoded = json_decode($result, true);
//get info about the request
$error = curl_getinfo($ch[$x], CURLINFO_HTTP_CODE);
//error handling
if($error != 200){
$again[] = array("Url" => $urls[$x], "errornbr" => $error);
} else {
// Here I do what ever I want with the data
}
curl_multi_remove_handle($mh, $ch[$x]);
curl_close($ch[$x]);
}
curl_multi_close($mh);
?>
In the second for loop, where you cycle through the cURL handles to examine what each one returned, you can check for errors like this. I hope this approach answers your question:
foreach ($ch as $key => $h) {
//This checks for any error that may have occurred on the handle. Whatever the
//error is, you can handle it in the if-branch and save the URL to the array
//$again so it can be requested again at a later stage.
if (curl_errno($h)) {
//This is how you get the complete information about what happened to the
//curl handle. All of it is stored in error_info.
$again[] = array("Url" =>curl_getinfo($h, CURLINFO_EFFECTIVE_URL), "error_info" => curl_getinfo($h));
}
else{
//here you will handle the success scenario for each curl handler.
$responses[$key] = ['data' => curl_multi_getcontent($h)];
}
//remove the curl handle here, as you already do in your loop
}
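As far as I know there is no built-in retry, so collecting the failed URLs and running them again is the usual route. A minimal sketch of that second pass follows. The names retryDelay() and fetchWithRetries() are hypothetical, and $fetchBatch stands for whatever function runs your multi cURL batch and returns the URLs that should be tried again. The doubling delay and the limit of 3 attempts are assumptions, not something the API prescribes.

```php
<?php
// Hypothetical helper: seconds to wait before retry round $attempt (1-based).
// The delay doubles each round and is capped at $maxDelay.
function retryDelay(int $attempt, int $baseDelay = 1, int $maxDelay = 60): int
{
    return min($maxDelay, $baseDelay * (2 ** ($attempt - 1)));
}

// Hypothetical driver: $fetchBatch takes a list of URLs and returns the list
// of URLs that failed and should be requested again.
// $sleep is injectable so the loop can be tried out without really waiting.
function fetchWithRetries(array $urls, callable $fetchBatch, int $maxAttempts = 3, ?callable $sleep = null): array
{
    $sleep = $sleep ?? 'sleep';
    $pending = $urls;
    for ($attempt = 1; $attempt <= $maxAttempts && $pending; $attempt++) {
        if ($attempt > 1) {
            // pause before every round except the first
            $sleep(retryDelay($attempt - 1));
        }
        $pending = $fetchBatch($pending);
    }
    return $pending; // URLs that still failed after all attempts
}
```

You would pass your own batch function as $fetchBatch and have it return the URLs from $again. Whatever comes back from fetchWithRetries() is the set of URLs that never succeeded.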
For multiple handles there is curl_multi_info_read (https://www.php.net/manual/en/function.curl-multi-info-read.php),
so the error check (assuming an HTTP connection) should look like this:
while ($a = curl_multi_info_read($mh))
{
if ($b = $a['result'])
{
echo curl_strerror($b);# CURLE_* error
}
elseif (!($b = curl_getinfo($a['handle'], CURLINFO_RESPONSE_CODE)))
{
echo 'connection failed';
}
elseif ($b !== 200)
{
echo 'HTTP status is not 200 OK';
}
}
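To turn the check above into a list of URLs to run again, something like the following might work. It is only a sketch: classifyTransfer() and collectResults() are hypothetical names, and treating HTTP 429 and 5xx as "retry" is an assumption about how the API signals rate limiting and temporary failures. Check what your API actually returns.

```php
<?php
// Hypothetical helper: decide what to do with one finished transfer.
// $curlResult is the 'result' field from curl_multi_info_read() (a CURLE_* code),
// $httpCode is CURLINFO_RESPONSE_CODE for the same handle.
function classifyTransfer(int $curlResult, int $httpCode): string
{
    if ($curlResult !== 0) {       // 0 is CURLE_OK
        return 'retry';            // transport-level failure (DNS, timeout, ...)
    }
    if ($httpCode === 429 || $httpCode >= 500) {
        return 'retry';            // assumption: rate limit / server errors are temporary
    }
    return $httpCode === 200 ? 'ok' : 'fail';
}

// Drain the message queue of the multi handle after all transfers have
// finished and sort each URL into a bucket.
function collectResults($mh): array
{
    $buckets = ['ok' => [], 'retry' => [], 'fail' => []];
    while ($msg = curl_multi_info_read($mh)) {
        $h = $msg['handle'];
        // the effective URL equals the requested URL unless redirects are followed
        $url = curl_getinfo($h, CURLINFO_EFFECTIVE_URL);
        $code = (int) curl_getinfo($h, CURLINFO_RESPONSE_CODE);
        $buckets[classifyTransfer($msg['result'], $code)][] = $url;
    }
    return $buckets;
}
```

Call collectResults($mh) once the curl_multi_exec loop is done and before the handles are removed. The 'retry' bucket then plays the role of the $again array from the question.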
Consider this code pseudo-code for modern PHP versions (I didn't test this exact variant, but the scheme will work). Calling curl_errno() on "easy" handles added to a "multi" handle will return 0, i.e. no error is reported.