How do you get the HTTP status code for a remote domain in PHP?

I would like to create a batch script that goes through 20,000 links in a DB and weeds out all the 404s and such. How would I get the HTTP status code for a remote URL?

Preferably not using cURL, since I don't have it installed.

cURL would be perfect, but since you don't have it, you'll have to get down and dirty with sockets. The technique is:

  1. Open a socket to the server.
  2. Send an HTTP HEAD request.
  3. Parse the response.

Here is a quick example:

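<?php

$url = parse_url('http://www.example.com/index.html');

// parse_url() omits keys that are absent from the URL, so supply defaults.
$host  = $url['host'];
$port  = isset($url['port']) ? $url['port'] : 80;
$path  = isset($url['path']) ? $url['path'] : '/';
$query = isset($url['query']) ? '?' . $url['query'] : '';

$request = "HEAD $path$query HTTP/1.1\r\n"
          ."Host: $host\r\n"
          ."Connection: close\r\n"
          ."\r\n";

$address = gethostbyname($host);
$socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
socket_connect($socket, $address, $port);

socket_write($socket, $request, strlen($request));

// The status line looks like "HTTP/1.1 200 OK", so the code is the second token.
// split() was removed in PHP 7; explode() does the same job here.
$response = explode(' ', socket_read($socket, 1024));

print "<p>Response: ". $response[1] ."</p>\r\n";

socket_close($socket);

?>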

UPDATE: I've added a few lines to parse the URL.
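Splitting the response on spaces works for a well-formed reply, but it returns junk if the server sends anything unexpected. A slightly more defensive way to do step 3 is to match the status line with a regular expression. This is only a sketch: the function name parse_status_code is made up for illustration, and it takes the raw response text as a string.

```php
<?php

/**
 * Extract the numeric status code from a raw HTTP response.
 * Hypothetical helper for illustration; returns null when the
 * response does not start with a valid status line.
 */
function parse_status_code($response)
{
    // A status line looks like "HTTP/1.1 404 Not Found".
    // The pattern is anchored at the start of the response, so header
    // lines further down can never be mistaken for the status line.
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $response, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Usage with a canned response instead of a live socket:
var_dump(parse_status_code("HTTP/1.1 404 Not Found\r\nServer: example\r\n\r\n"));
```

In the socket example above, you would pass the result of socket_read() to this function instead of exploding it.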

If I'm not mistaken, none of the PHP built-in functions return the HTTP status of a remote URL, so the best option would be to use sockets to open a connection to the server, send a request, and parse the response status:

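Roughly:

$url = 'http://www.example.com/index.html';
$parts = parse_url($url);               // parse url => $host, $port, $path
$host = $parts['host'];
$port = isset($parts['port']) ? $parts['port'] : 80;
$path = isset($parts['path']) ? $parts['path'] : '/';
$timeout = 10;                          // seconds
$request = "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
$status = null;
$fp = fsockopen($host, $port, $errno, $errstr, $timeout);
if (!$fp) {
    die("Connection failed: $errstr ($errno)\n");
}
fwrite($fp, $request);
while (!feof($fp)) {
    $line = fgets($fp, 4096);
    // The first line is the status line, e.g. "HTTP/1.0 404 Not Found".
    if (preg_match('#^HTTP/\d\.\d\s+(\d{3})#', $line, $m)) {
        $status = (int) $m[1];
        break;                          // status read, no need to fetch the body
    }
}
fclose($fp);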
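The URL-parsing step can be factored into a small helper. Here is a minimal sketch, assuming a made-up function name url_to_request_parts. It keeps the query string, falls back to "/" when the URL has no path, and picks a default port from the scheme. Note that a plain socket on port 443 will not speak TLS, so https links need extra handling that isn't shown here.

```php
<?php

/**
 * Split a URL into the pieces a raw HTTP request needs.
 * Hypothetical helper for illustration. Returns an array with the keys
 * 'host', 'port' and 'target', or null when the URL has no host.
 */
function url_to_request_parts($url)
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return null;
    }

    // parse_url() only returns the keys that are present in the URL.
    $scheme = isset($parts['scheme']) ? strtolower($parts['scheme']) : 'http';
    $defaultPort = ($scheme === 'https') ? 443 : 80;

    // The request target is the path plus the query string, if any.
    $target = isset($parts['path']) ? $parts['path'] : '/';
    if (isset($parts['query'])) {
        $target .= '?' . $parts['query'];
    }

    return array(
        'host'   => $parts['host'],
        'port'   => isset($parts['port']) ? $parts['port'] : $defaultPort,
        'target' => $target,
    );
}

// Usage:
print_r(url_to_request_parts('http://www.example.com/index.html?page=2'));
```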

Another option is to use an already-built HTTP client class in PHP that can return the headers without fetching the full page content. There should be a few open source classes available on the net.

This page looks like it has a pretty good setup to download a page using either curl or fsockopen, and can get the HTTP headers using either method (which is what you want, really).

After using that method, you'd want to check $output['info']['http_code'] to get the data you want.
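For the batch job in the question, that check could be wrapped in a small predicate. This is only a sketch: the function name is_dead_link is made up, the shape of $output is taken from the $output['info']['http_code'] expression above, and treating every 4xx and 5xx code (plus a missing code) as dead is an assumption you may want to loosen.

```php
<?php

/**
 * Decide whether a fetched link should be weeded out.
 * Hypothetical helper for illustration; expects the status code at
 * $output['info']['http_code'].
 */
function is_dead_link(array $output)
{
    // A missing code is treated as 0, meaning no usable response.
    $code = isset($output['info']['http_code'])
        ? (int) $output['info']['http_code']
        : 0;

    // Assumption: 4xx and 5xx both count as dead; 2xx and 3xx are kept.
    return $code === 0 || $code >= 400;
}

// Usage with hand-built arrays instead of real fetch results:
var_dump(is_dead_link(array('info' => array('http_code' => 404))));
var_dump(is_dead_link(array('info' => array('http_code' => 200))));
```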

Hope that helps.

You can use PEAR's HTTP::head function.
http://pear.php.net/manual/en/package.http.http.head.php

A quick bit of googling found this link: http://www.webmasterworld.com/forum88/12559.htm. The most up-to-date version is near the bottom.
