How do you get the HTTP status code for a remote domain in PHP?
I would like to create a batch script to go through 20,000 links in a DB and weed out all the 404s and such. How would I get the HTTP status code for a remote URL?
Preferably not using cURL, since I don't have it installed.
cURL would be perfect, but since you don't have it, you'll have to get down and dirty with sockets. The technique is: open a socket to the server, send a HEAD request, and read the status code from the first line of the response.
Here is a quick example:
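<?php
// Break the URL into its parts; missing parts fall back to defaults.
$url   = parse_url('http://www.example.com/index.html');
$host  = $url['host'];
$port  = isset($url['port']) ? $url['port'] : 80;
$path  = isset($url['path']) ? $url['path'] : '/';
$query = isset($url['query']) ? '?' . $url['query'] : '';
// A HEAD request asks the server for the headers only, not the body.
$request = "HEAD $path$query HTTP/1.1\r\n"
         . "Host: $host\r\n"
         . "Connection: close\r\n"
         . "\r\n";
$address = gethostbyname($host);
$socket  = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
socket_connect($socket, $address, $port);
socket_write($socket, $request, strlen($request));
// The status line looks like "HTTP/1.1 200 OK"; the code is the second token.
// explode() replaces split(), which was removed in PHP 7.
$response = explode(' ', socket_read($socket, 1024));
print "<p>Response: " . $response[1] . "</p>\r\n";
socket_close($socket);
?>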
UPDATE: I've added a few lines to parse the URL.
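The example above takes the second space-separated token of whatever the first read returns. As a slightly more defensive sketch, the status code could instead be matched against the shape of the status line. `parse_status_code` is a hypothetical helper name, not part of the original answer.

```php
<?php
// Hypothetical helper (an illustration, not from the original answer):
// extract the three-digit status code from a raw HTTP response,
// or return null when the text does not start with a status line.
function parse_status_code($response)
{
    // A status line looks like "HTTP/1.1 404 Not Found".
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $response, $matches)) {
        return (int) $matches[1];
    }
    return null;
}
```

For instance, `parse_status_code("HTTP/1.1 404 Not Found\r\n")` yields `404`, while text that does not begin with a status line yields `null`, which lets the caller tell "no valid response" apart from a real status.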
If I'm not mistaken, none of PHP's built-in functions returns the HTTP status of a remote URL, so the best option would be to use sockets to open a connection to the server, send a request, and parse the response status:
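In code:
<?php
$url = 'http://www.example.com/index.html';
$timeout = 10; // seconds
$parts = parse_url($url); // parse url => $host, $port, $path
$host = $parts['host'];
$port = isset($parts['port']) ? $parts['port'] : 80;
$path = isset($parts['path']) ? $parts['path'] : '/';
$request = "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
$fp = fsockopen($host, $port, $errno, $errstr, $timeout);
if (!$fp) {
    die("Connection failed: $errstr ($errno)\n");
}
fwrite($fp, $request);
$status = null;
while (!feof($fp)) {
    $line = (string) fgets($fp, 4096);
    if (preg_match('#^HTTP/\S+ (\d{3})#', $line, $m)) { // e.g. "HTTP/1.0 200 OK"
        $status = (int) $m[1];
        break; // status read, no need to fetch the rest
    }
}
fclose($fp);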
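For the batch job described in the question, the steps above could be wrapped in a function and called once per link. This is a minimal sketch, assuming plain `http://` URLs. The names `http_status` and `is_dead`, the 5-second timeout, and the rule that any 4xx/5xx response counts as dead are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: return the HTTP status code for $url, or null on failure.
function http_status($url, $timeout = 5)
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null; // not a usable absolute URL
    }
    $host = $parts['host'];
    $port = isset($parts['port']) ? $parts['port'] : 80;
    $path = isset($parts['path']) ? $parts['path'] : '/';
    if (isset($parts['query'])) {
        $path .= '?' . $parts['query'];
    }

    // Suppress the warning; a false return already signals the failure.
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if (!$fp) {
        return null; // DNS failure, refused connection, or timeout
    }
    stream_set_timeout($fp, $timeout);
    // HEAD asks for headers only, so no page body is downloaded.
    fwrite($fp, "HEAD $path HTTP/1.0\r\nHost: $host\r\n\r\n");
    $statusLine = fgets($fp, 4096);
    fclose($fp);

    if ($statusLine === false) {
        return null; // connection closed before any response arrived
    }
    // Status line format: "HTTP/1.0 200 OK"; the code is the second token.
    $tokens = explode(' ', trim($statusLine));
    if (isset($tokens[1]) && ctype_digit($tokens[1])) {
        return (int) $tokens[1];
    }
    return null;
}

// Hypothetical rule: unreachable hosts and 4xx/5xx responses count as dead.
function is_dead($status)
{
    return $status === null || $status >= 400;
}
```

A batch run would then loop over the links from the database and flag each one for which `is_dead(http_status($link))` is true. Some servers reject HEAD requests, so a link that fails this way may deserve a second check with GET before it is removed.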
Another option is to use an existing HTTP client class for PHP that can return the headers without fetching the full page content; there should be a few open-source classes available online.
This page looks like it has a pretty good setup to download a page using either cURL or fsockopen, and it can get the HTTP headers using either method (which is what you want, really).
After using that method, you'd want to check $output['info']['http_code'] to get the data you want.
Hope that helps.
You can use PEAR's HTTP::head function.
http://pear.php.net/manual/en/package.http.http.head.php
http://www.webmasterworld.com/forum88/12559.htm (found with a quick bit of googling). The most up-to-date version is near the bottom.