
Ignore Content-Length header when using file_get_contents

I need to get the contents of a page that always sends a Content-Length: 0 header, even though the page is never empty.

file_get_contents(url) just returns an empty string.

The full set of headers returned by the page is:

HTTP/1.1 200 OK
X-Powered-By: PHP/5.3.10
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Last-Modified: Sat, 18 Feb 2012 18:14:59 GMT
Cache-Control: no-store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Pragma: no-cache
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Date: Sat, 18 Feb 2012 18:14:59 GMT
Server: lighttpd

Is it possible to use file_get_contents and ignore the header, or do I need to use curl?

Edit

get_headers(url) output (using print_r):

Array
(
    [0] => HTTP/1.0 200 OK
    [1] => X-Powered-By: PHP/5.3.10
    [2] => Content-type: text/html
    [3] => Content-Length: 0
    [4] => Connection: close
    [5] => Date: Sat, 18 Feb 2012 22:39:52 GMT
    [6] => Server: lighttpd
)

As noted by Optimist, the problem had nothing to do with the response headers; rather, I wasn't sending a User-Agent header to the server.

file_get_contents worked perfectly after sending a User-Agent header, even though the server always returns Content-Length: 0.

Weird.
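For reference, a minimal sketch of sending a User-Agent header with file_get_contents through a stream context. The helper name buildContextWithUserAgent, the User-Agent string, and the $url variable are placeholders for illustration, not taken from the original post; the network call is left commented out so the snippet runs offline.

```php
<?php
// Hypothetical helper: builds an HTTP stream context that sends a User-Agent header.
function buildContextWithUserAgent(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            // Sent as the User-Agent request header.
            'user_agent' => $userAgent,
        ],
    ]);
}

// Placeholder User-Agent string; any value the server accepts should do.
$context = buildContextWithUserAgent('Mozilla/5.0 (compatible; ExampleFetcher/1.0)');

// $url is assumed to hold the address of the page in question.
// $html = file_get_contents($url, false, $context);
```

Setting the user_agent ini value with ini_set('user_agent', ...) before the call should have the same effect for all HTTP stream requests in the script.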

I believe none of the HTTP-level functions can read such a response, because it is an invalid HTTP response: it says "my body is empty, don't read it".

You definitely need your own function based on fread, which will physically read the socket. Something like this:

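// Assumes $sURL holds a plain http:// URL.
$aURL    = parse_url($sURL);
$sResult = "";

// Open a raw TCP connection; fall back to port 80 when the URL has none.
if ($iHandle = fsockopen($aURL["host"], $aURL["port"] ?? 80, $iError, $sError))
{
    // Request target: path plus optional query string, "/" if the URL has no path.
    $sQuery = ($aURL["path"] ?? "/") . (isset($aURL["query"]) ? "?" . $aURL["query"] : "");

    $sOut   = "GET " . $sQuery . " HTTP/1.1\r\n";
    $sOut  .= "Host: " . $aURL["host"] . "\r\n";
    $sOut  .= "Connection: Close\r\n\r\n";

    fputs($iHandle, $sOut);

    // Read until the server closes the connection, ignoring Content-Length.
    while (!feof($iHandle))
    {
        $sResult .= fread($iHandle, 1024);
    }

    fclose($iHandle);
}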

Then just strip the headers from $sResult.
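Stripping the headers could look roughly like the sketch below. It relies on the blank line (CRLF CRLF) that separates headers from body in an HTTP response; the function name splitHttpResponse and the sample response are made up for illustration.

```php
<?php
// Hypothetical helper: splits a raw HTTP response into its header block and body.
function splitHttpResponse(string $rawResponse): array
{
    // Headers end at the first empty line; limit 2 keeps any later blank lines in the body.
    $parts = explode("\r\n\r\n", $rawResponse, 2);

    return [
        'headers' => $parts[0],
        'body'    => $parts[1] ?? '',
    ];
}

// Made-up response resembling the one described in the question.
$sample = "HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n<html>hello</html>";
$parsed = splitHttpResponse($sample);

echo $parsed['body'], "\n";
```

If the server answers with Transfer-Encoding: chunked, the body returned here would still contain the chunk-size lines and would need to be decoded separately.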

