I am writing a PHP program that downloads a pdf from a backend and save to a local drive. Now how do I check whether the file exists before downloading?
Currently I am using curl (see code below) to check and download but it still downloads the file which is 1KB in size.
$url = "http://wedsite/test.pdf";
$path = "C:\\test.pdf;"
downloadAndSave($url,$path);
function downloadAndSave($urlS,$pathS)
{
$fp = fopen($pathS, 'w');
$ch = curl_init($urlS);
curl_setopt($ch, CURLOPT_FILE, $fp);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
echo $httpCode;
//If 404 is returned, then file is not found.
if(strcmp($httpCode,"404") == 1)
{
echo $httpCode;
echo $urlS;
}
fclose($fp);
}
I want to check whether the file exists before even downloading. Any idea how to do it?
You can do this with a separate curl HEAD
request:
curl_setopt($ch, CURLOPT_NOBODY, true);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
When you actually want to download you can use set NOBODY
to false
.
Since you are using HTTP to fetch a resource on the internet, what you really want to check is that the return code is a 404.
On some PHP installations, you can just use file_exists($url)
out of the box. This does not work in all environments, however. http://www.php.net/manual/en/wrappers.http.php
Here is a function much like file_exists
but for URLs, using curl:
<?php function curl_exists()
$file_headers = @get_headers($url);
if($file_headers[0] == 'HTTP/1.1 404 Not Found') {
$exists = false;
}
else {
$exists = true;
}
} ?>
source: http://www.php.net/manual/en/function.file-exists.php#75064
Sometimes the CURL extension isn't installed with PHP. In that case you can still use the socket library in the PHP core:
<?php function url_exists($url) {
$a_url = parse_url($url);
if (!isset($a_url['port'])) $a_url['port'] = 80;
$errno = 0;
$errstr = '';
$timeout = 30;
if(isset($a_url['host']) && $a_url['host']!=gethostbyname($a_url['host'])){
$fid = fsockopen($a_url['host'], $a_url['port'], $errno, $errstr, $timeout);
if (!$fid) return false;
$page = isset($a_url['path']) ?$a_url['path']:'';
$page .= isset($a_url['query'])?'?'.$a_url['query']:'';
fputs($fid, 'HEAD '.$page.' HTTP/1.0'."\r\n".'Host: '.$a_url['host']."\r\n\r\n");
$head = fread($fid, 4096);
$head = substr($head,0,strpos($head, 'Connection: close'));
fclose($fid);
if (preg_match('#^HTTP/.*\s+[200|302]+\s#i', $head)) {
$pos = strpos($head, 'Content-Type');
return $pos !== false;
}
} else {
return false;
}
} ?>
source: http://www.php.net/manual/en/function.file-exists.php#73175
An even faster function can be found here: http://www.php.net/manual/en/function.file-exists.php#76246
Call this before your download function and it's done:
<?php function remoteFileExists($url) {
$curl = curl_init($url);
//don't fetch the actual page, you only want to check the connection is ok
curl_setopt($curl, CURLOPT_NOBODY, true);
//do request
$result = curl_exec($curl);
$ret = false;
//if request did not fail
if ($result !== false) {
//if request was ok, check response code
$statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($statusCode == 200) {
$ret = true;
}
}
curl_close($curl);
return $ret;
}
?>
In the first example above $file_headers[0] may contain more than or something other than 'HTTP/1.1 404 Not Found', eg:
HTTP/1.1 404 Document+%2Fdb%2Fscotbiz%2Freports%2FR20131212%2Exml+not+found
So it's important to use some other test, eg, regex, as '==' is not reliable.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.