
PHP Curl check for file existence before downloading

I am writing a PHP program that downloads a PDF from a backend and saves it to a local drive. How do I check whether the file exists before downloading?

Currently I am using curl (see code below) to check and download, but when the file is missing it still writes a local file about 1KB in size.

$url = "http://wedsite/test.pdf";
$path = "C:\\test.pdf";
downloadAndSave($url,$path);

function downloadAndSave($urlS,$pathS)
    {
        $fp = fopen($pathS, 'w');

        $ch = curl_init($urlS);

        curl_setopt($ch, CURLOPT_FILE, $fp);
        $data = curl_exec($ch);

        $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        echo $httpCode;
        //If 404 is returned, then file is not found.
        if(strcmp($httpCode,"404") == 1)
        {
            echo $httpCode;
            echo $urlS; 
        }

        fclose($fp);

    }

I want to check whether the file exists before even downloading. Any idea how to do it?

You can do this with a separate curl HEAD request:

curl_setopt($ch, CURLOPT_NOBODY, true);
$data = curl_exec($ch);

$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

When you actually want to download, set CURLOPT_NOBODY back to false.
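The two steps can be combined into one function. The following is only a sketch: the name downloadIfExists, the timeout value, and the decision to follow redirects are assumptions, not part of the original answer. It sends a HEAD request first and opens the local file only if the server answers 200, so a missing remote file leaves nothing on disk.

```php
<?php
// Hypothetical helper (name and options are illustrative assumptions).
// Returns true only if the URL answered 200 and was saved to $path.
function downloadIfExists(string $url, string $path): bool
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request: headers only
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // assumption: follow redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);     // assumption: 5 second connect timeout

    $exists = curl_exec($ch) !== false
        && (int) curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;
    if (!$exists) {
        return false; // nothing has been written to disk
    }

    // Open the target file only after the existence check succeeded.
    $fp = fopen($path, 'w');
    if ($fp === false) {
        return false;
    }

    curl_setopt($ch, CURLOPT_NOBODY, false);
    curl_setopt($ch, CURLOPT_HTTPGET, true);  // make sure the method is GET again
    curl_setopt($ch, CURLOPT_FILE, $fp);      // stream the body into the file

    $downloaded = curl_exec($ch) !== false
        && (int) curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;
    fclose($fp);

    if (!$downloaded) {
        unlink($path); // remove a partial or error-page file
    }
    return $downloaded;
}
```

Calling downloadIfExists($url, $path) in place of downloadAndSave($url, $path) should avoid the stray 1KB file, provided the server handles HEAD requests the same way it handles GET.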

Since you are using HTTP to fetch a resource on the internet, what you really want to check is whether the return code is 404.

On some PHP installations, file_exists($url) may appear to work out of the box. This does not work in all environments, however: the HTTP wrapper documentation lists stat() as unsupported, and file_exists depends on it. See http://www.php.net/manual/en/wrappers.http.php

Here is a function much like file_exists but for URLs, using get_headers:

<?php function curl_exists($url) {
  // get_headers() returns false on failure; @ suppresses the warning.
  $file_headers = @get_headers($url);
  if ($file_headers === false || $file_headers[0] == 'HTTP/1.1 404 Not Found') {
    return false;
  }
  return true;
} ?>

source: http://www.php.net/manual/en/function.file-exists.php#75064

Sometimes the cURL extension isn't installed with PHP. In that case you can still use the socket functions in the PHP core:

<?php function url_exists($url) {
       $a_url = parse_url($url);
       if (!isset($a_url['port'])) $a_url['port'] = 80;
       $errno = 0;
       $errstr = '';
       $timeout = 30;
       if(isset($a_url['host']) && $a_url['host']!=gethostbyname($a_url['host'])){
           $fid = fsockopen($a_url['host'], $a_url['port'], $errno, $errstr, $timeout);
           if (!$fid) return false;
           $page = isset($a_url['path'])  ?$a_url['path']:'';
           $page .= isset($a_url['query'])?'?'.$a_url['query']:'';
           fputs($fid, 'HEAD '.$page.' HTTP/1.0'."\r\n".'Host: '.$a_url['host']."\r\n\r\n");
           $head = fread($fid, 4096);
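           // Keep only the part before "Connection: close", if that header is present.
           $end = strpos($head, 'Connection: close');
           if ($end !== false) $head = substr($head, 0, $end);
           fclose($fid);
           // (200|302) is an alternation; [200|302]+ was a character class matching any of those digits.
           if (preg_match('#^HTTP/\S+\s+(200|302)\s#i', $head)) {
            $pos = stripos($head, 'Content-Type');
            return $pos !== false;
           }
           return false;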
       } else {
           return false;
       }
   } ?>

source: http://www.php.net/manual/en/function.file-exists.php#73175

An even faster function can be found here: http://www.php.net/manual/en/function.file-exists.php#76246

Call this before your download function and you're done:

<?php function remoteFileExists($url) {
    $curl = curl_init($url);

    //don't fetch the actual page, you only want to check the connection is ok
    curl_setopt($curl, CURLOPT_NOBODY, true);

    //do request
    $result = curl_exec($curl);

    $ret = false;

    //if request did not fail
    if ($result !== false) {
        //if request was ok, check response code
        $statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);  

        if ($statusCode == 200) {
            $ret = true;   
        }
    }

    curl_close($curl);

    return $ret;
}

?>

In the first example above, $file_headers[0] may contain more than, or something other than, 'HTTP/1.1 404 Not Found', e.g.:

HTTP/1.1 404 Document+%2Fdb%2Fscotbiz%2Freports%2FR20131212%2Exml+not+found

So it's important to use some other test, e.g. a regex, because '==' against a fixed status line is not reliable.
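One way to do that is to pull the numeric code out of the status line and compare the number instead of the whole string. This is a minimal sketch; the function name statusCodeFromLine is made up for illustration, and the pattern assumes a status line of the form "HTTP/<version> <3-digit code> <optional reason>".

```php
<?php
// Hypothetical helper: extracts the 3-digit status code from an HTTP status
// line, ignoring whatever reason phrase the server chose to send.
function statusCodeFromLine(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})(?:\s|$)#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null; // not a recognizable status line
}
```

With this, the check becomes statusCodeFromLine($file_headers[0]) === 404, which matches the unusual reason phrase shown above as well as the standard one. Note that when get_headers follows redirects, the returned array can contain more than one status line, so the first element may describe the redirect rather than the final response.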
