
What is the most efficient way to test URLs for 404 errors?

I'd like to learn the best / leanest way to test URLs for server response codes such as 404. I am currently using something very similar to the code found in the comments of the PHP manual page for get_headers:

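<?php
function get_http_response_code($theURL) {
    // get_headers() returns false when the request fails outright.
    $headers = @get_headers($theURL);
    if ($headers === false) {
        return 0;
    }
    // Status line looks like "HTTP/1.1 404 Not Found"; the code starts at offset 9.
    return (int) substr($headers[0], 9, 3);
}

$code = get_http_response_code('filename.jpg');
if ($code > 0 && $code < 400) {
    // File exists, huzzah!
}
?>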

However, running this over more than 50 URLs in a foreach loop typically causes my server to give up and return a 500 response (apologies for the vagueness about the exact error). So I wonder whether there is a less resource-heavy method that can check URL response codes en masse?

You could execute several curl requests at the same time using the curl_multi_* functions.

However, this would still block execution until the slowest request has returned (plus some additional time for response parsing).
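One way to bound how long the slowest request can hold everything up is to give each handle a connect timeout and a total timeout. The following is only a sketch: the helper name `buildCheckOptions()` and the timeout values are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: builds the curl option array for one status check.
// The default timeout values are illustrative assumptions, not recommendations.
function buildCheckOptions(string $url, int $connectTimeout = 3, int $totalTimeout = 10): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_NOBODY         => true,            // send a HEAD request, skip the body
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 5,               // do not follow endless redirect chains
        CURLOPT_CONNECTTIMEOUT => $connectTimeout, // seconds allowed to establish the connection
        CURLOPT_TIMEOUT        => $totalTimeout,   // seconds allowed for the whole transfer
    ];
}

// Usage: apply the options to a handle before adding it to the multi handle.
// $ch = curl_init();
// curl_setopt_array($ch, buildCheckOptions('http://example.com/'));
```

A request that is cut off before any response arrives will usually report a status of 0 from CURLINFO_HTTP_CODE, so callers should treat 0 as "no answer" rather than as success.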

Tasks like this should be executed in the background using cron jobs or similar alternatives.
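For a background job, it can also help to limit how many connections are open at once by working through the list in batches. A minimal sketch, assuming a hypothetical helper `checkInBatches()` and a checker callable that takes an array of URLs and returns results keyed by URL (both names are made up for illustration):

```php
<?php
// Hypothetical helper: runs $checker over $urls in fixed-size batches.
// $checker is any callable that accepts an array of URLs and returns
// an array of results keyed by URL (for example a curl_multi based function).
function checkInBatches(array $urls, int $batchSize, callable $checker): array
{
    $results = [];
    foreach (array_chunk($urls, max(1, $batchSize)) as $batch) {
        // Each call handles at most $batchSize URLs at the same time.
        foreach ($checker($batch) as $key => $row) {
            $results[$key] = $row;
        }
    }
    return $results;
}

// Possible use inside a CLI script started by cron
// (file names below are placeholders, not real paths):
// $urls = file('urls.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// $results = checkInBatches($urls, 20, 'getStatusCodes');
// file_put_contents('results.json', json_encode($results));
```

The batch size is a trade-off: smaller batches put less load on the server running the check, larger ones finish sooner.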

Additionally, there are multiple libraries on GitHub and elsewhere that wrap the curl extension to provide a nicer API.

The concept boils down to this (CPU "fix" by Ren@php-docs):

function getStatusCodes(array $urls, $useHead = true) {
    $handles = [];
    foreach($urls as $url) {
        $options = [
            CURLOPT_URL => $url,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_NOBODY => $useHead,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_HEADER => 0
        ];
        $handles[$url] = curl_init();
        curl_setopt_array($handles[$url], $options);
    }

    $mh = curl_multi_init();

    foreach($handles as $handle) {
        curl_multi_add_handle($mh, $handle);
    }

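    $running = null;
    do {
        curl_multi_exec($mh, $running);
        // Wait for activity instead of spinning; if select fails (-1),
        // sleep briefly so the loop does not burn CPU.
        if ($running > 0 && curl_multi_select($mh) === -1) {
            usleep(1000);
        }
    } while ($running > 0);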

    $return = [];
    foreach($handles as $handle) {
        // Key by the effective URL, i.e. the final URL after any redirects.
        $eUrl = curl_getinfo($handle, CURLINFO_EFFECTIVE_URL);
        $return[$eUrl] = [
            'url' => $eUrl,
            'status' => curl_getinfo($handle, CURLINFO_HTTP_CODE)
        ];
        curl_multi_remove_handle($mh, $handle);
        curl_close($handle);
    }
    curl_multi_close($mh);

    return $return; 
}

var_dump(getStatusCodes(['http://google.de', 'http://stackoverflow.com', 'http://google.de/noone/here']));
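Since the goal is to find dead links, the result array can then be filtered down to the entries that look broken. This is a sketch under assumptions: `filterBroken()` is a hypothetical helper, and the sample data merely stands in for real getStatusCodes() output.

```php
<?php
// Hypothetical helper: keeps only the entries that look broken.
// Expects rows shaped like ['url' => string, 'status' => int],
// which is the shape getStatusCodes() returns.
function filterBroken(array $results): array
{
    return array_filter($results, function (array $row): bool {
        $status = (int) $row['status'];
        // 0 means curl received no HTTP response at all (DNS failure, timeout, ...).
        return $status === 0 || $status >= 400;
    });
}

// Sample data standing in for real getStatusCodes() output.
$sample = [
    'http://example.com/'        => ['url' => 'http://example.com/', 'status' => 200],
    'http://example.com/missing' => ['url' => 'http://example.com/missing', 'status' => 404],
    'http://example.invalid/'    => ['url' => 'http://example.invalid/', 'status' => 0],
];
print_r(array_keys(filterBroken($sample)));
```

Treating a status of 0 as broken is a design choice; a stricter checker might retry those URLs before reporting them.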
