Find highest number in file names on web server

On my webserver, I have a folder with numbered image files:

...
296.jpg
297.png
298.gif
...

The numbers are consecutive (1, 2, 3, ...). The file name contains only the number ("12.jpg", not "photo_12.jpg"). The files may not have been created and stored in the order of their file name numbering (i.e. 2000.jpg might be older than 2.jpg).

I want to find the highest number in the file names.

I do this:

$glob = glob("path/to/dir/*");
$highest = max(preg_replace("|[^0-9]|", "", $glob));
// $highest is now something like 381554

Is there a less resource-heavy method?

First of all, you have to decide which resource you want to save, because the approach differs depending on whether it is memory, I/O operations, or something else.

So far, your solution is the most optimised in terms of speed, but it is very memory-hungry: glob() loads every file name into an array, so with a lot of files in the folder you may hit the memory limit.

I suggest you cache the max somewhere, in Redis for example. And then update it every time you upload a new image. To cache it you have to fetch it first. You can get the initial max value either with a simple script:

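$max = 0;
foreach (new DirectoryIterator('.') as $fileInfo) {
    // Skip "." and ".." as well as subdirectories
    if ($fileInfo->isDot() || !$fileInfo->isFile()) continue;

    // File name without its extension, e.g. "297" for "297.png"
    $current = pathinfo($fileInfo->getFilename(), PATHINFO_FILENAME);
    // ctype_digit() accepts only plain digits; is_numeric() would also accept "1e5" or "1.5"
    if (!ctype_digit($current)) continue;
    if ((int) $current > $max) $max = (int) $current;
}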

Or with a call to an external sort command as vladyslav-savchenko suggested.

Then you just have to keep the max value up to date. Update it on every upload, by cron, or both.
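As a rough illustration of the caching idea, here is a minimal sketch. The answer mentions Redis; this sketch swaps in a plain file with a lock so it runs without any extra service. The function name updateCachedMax and the cache file location are hypothetical, not something from the original answer.

```php
<?php
// Hypothetical helper: keeps the highest image number in a small cache file.
// Call it after every successful upload with the number of the new file.
function updateCachedMax(string $cacheFile, int $newNumber): int
{
    // 'c+' opens for reading and writing, creates the file if it is missing,
    // and does not truncate existing content.
    $handle = fopen($cacheFile, 'c+');
    if ($handle === false) {
        throw new RuntimeException("Cannot open cache file: $cacheFile");
    }
    try {
        flock($handle, LOCK_EX); // serialise concurrent uploads
        $cached = (int) stream_get_contents($handle); // an empty file yields 0
        $max = max($cached, $newNumber);
        if ($max !== $cached) {
            // Overwrite the old value with the new maximum
            ftruncate($handle, 0);
            rewind($handle);
            fwrite($handle, (string) $max);
            fflush($handle);
        }
        return $max;
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}
```

Reading the current value is then a single small file read, for example (int) file_get_contents($cacheFile), instead of a directory scan. A cron job could still run the full scan occasionally and overwrite the file to correct any drift.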

This may work:

$numeric_files=glob("[0-9]*.*");
$slike = array_map(function($e){return pathinfo($e, PATHINFO_FILENAME);}, $numeric_files);
echo max($slike);

Starting with

$path = "path/to/dir/";

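Let's get an array of the file names (without their extensions)

$myFile = [];

if ($handle = opendir($path)) {
    while (false !== ($entry = readdir($handle))) {
        if ($entry != "." && $entry != "..") {
            // Prefix $path, otherwise is_dir() checks relative to the current working directory
            if (!is_dir($path . $entry)) {
                $myFile[] = substr($entry, 0, strrpos($entry, "."));
            }
        }
    }
    closedir($handle);
}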

Then we can sort the array

rsort($myFile,SORT_NUMERIC);

The first element will be the one we were searching for

print $myFile[0];

This is an example and is untested.

I don't think this will result in a good solution, especially with a large number of files, which I'm assuming because of your comment that the highest number is about 381k. Scanning the directory will cause high I/O and possibly real performance problems when you have many visitors and/or a slow or highly loaded server, perhaps with an (older) HDD, which is common for storing images.

I would recommend storing the file names in a database. Even if you're not using a database yet, this is the best solution because you can get the highest number with a clean SQL query, which causes much less I/O load than scanning huge directories on the file system. You can also profit from indexes, which further speed up your database queries.

It's not necessary to store the full path, and it is even a bad idea when all files are in one folder. In that case you would produce unnecessary redundancy, which wastes storage and creates extra work if you want to change the path later. It's better to store only the file names and define a constant for the path in your config or script, like

define('IMAGE_PATH', '/var/www/images');

When you want to proceed with the selected image, you can do something like this:

$fullImagePath = IMAGE_PATH . '/' . $databaseQueryResult['fileName'];

I don't know what you want to do, but if you're not using a database yet it may be a good idea to rethink your design. Anything in the image-hosting area looks to me like a case where a database is a good idea, also for other features you may want to implement.
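To make the database suggestion more concrete, here is a minimal sketch of the lookup. Everything specific in it is an assumption for illustration: the table name images, the columns file_number and file_name, and the use of an in-memory SQLite database through PDO (which needs the pdo_sqlite extension). A real setup would use its own DSN and schema.

```php
<?php
// In-memory SQLite keeps the sketch self-contained (assumption);
// a real application would point the DSN at its own database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema: the number is the integer primary key, so it is indexed.
$pdo->exec('CREATE TABLE images (file_number INTEGER PRIMARY KEY, file_name TEXT NOT NULL)');

// Register a few files, as an upload script would do for each new image.
$insert = $pdo->prepare('INSERT INTO images (file_number, file_name) VALUES (?, ?)');
foreach (['296.jpg', '297.png', '298.gif'] as $fileName) {
    $insert->execute([(int) pathinfo($fileName, PATHINFO_FILENAME), $fileName]);
}

// MAX() returns NULL for an empty table; the cast turns that into 0.
$highest = (int) $pdo->query('SELECT MAX(file_number) FROM images')->fetchColumn();

echo $highest, PHP_EOL; // prints 298
```

Storing the number in its own integer column avoids parsing file names inside the query, and the index lets the database answer MAX() without reading every row.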

You can use something like this:

$path = 'path_to_directory';

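$command = 'ls ' . escapeshellarg($path) . ' | sort -rn | head -1';

// exec() returns the last line of output without echoing it;
// system() would already print the output, so printing it again would duplicate it
$output = exec($command, $lines, $status);

if ($output === false || $status !== 0) {
    print 'Error during execution of: "' . $command . '"';
} else {
    print $output; // the file name with the highest number, including its extension
}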

Here is what I was getting at with my comment about a binary search.

It needs no memory and takes just 0.003 seconds and 35 filechecks with 100,000 files.

I guess you could code it in PHP, or shell out to it.

#!/bin/bash
checkfile(){
   # Use the function argument $1 throughout instead of the global loop variable
   if [ -f "$1.jpg" ]; then
      echo "DEBUG: Testing $1.jpg, exists - so move min marker to $1"
      min=$1
      return 0
   else
      echo "DEBUG: Testing $1.jpg, nope - so move max marker to $1"
      max=$1
      return 1
   fi
}
i=1
min=0
max=-1
while : ; do
   if checkfile $i && [[ $max -eq -1 ]]; then
     ((i*=2))
   else
     ((i=(max+min)/2))
   fi
   diff=$((max-min))
   [[ $diff -eq 1 ]] && break
done
echo Result:$min

Output:

DEBUG: Testing 1.jpg, exists - so move min marker to 1
DEBUG: Testing 2.jpg, exists - so move min marker to 2
DEBUG: Testing 4.jpg, exists - so move min marker to 4
DEBUG: Testing 8.jpg, exists - so move min marker to 8
DEBUG: Testing 16.jpg, exists - so move min marker to 16
DEBUG: Testing 32.jpg, exists - so move min marker to 32
DEBUG: Testing 64.jpg, exists - so move min marker to 64
DEBUG: Testing 128.jpg, exists - so move min marker to 128
DEBUG: Testing 256.jpg, exists - so move min marker to 256
DEBUG: Testing 512.jpg, exists - so move min marker to 512
DEBUG: Testing 1024.jpg, exists - so move min marker to 1024
DEBUG: Testing 2048.jpg, exists - so move min marker to 2048
DEBUG: Testing 4096.jpg, exists - so move min marker to 4096
DEBUG: Testing 8192.jpg, exists - so move min marker to 8192
DEBUG: Testing 16384.jpg, exists - so move min marker to 16384
DEBUG: Testing 32768.jpg, exists - so move min marker to 32768
DEBUG: Testing 65536.jpg, exists - so move min marker to 65536
DEBUG: Testing 131072.jpg, nope - so move max marker to 131072
DEBUG: Testing 98304.jpg, exists - so move min marker to 98304
DEBUG: Testing 114688.jpg, nope - so move max marker to 114688
DEBUG: Testing 106496.jpg, nope - so move max marker to 106496
DEBUG: Testing 102400.jpg, nope - so move max marker to 102400
DEBUG: Testing 100352.jpg, nope - so move max marker to 100352
DEBUG: Testing 99328.jpg, exists - so move min marker to 99328
DEBUG: Testing 99840.jpg, exists - so move min marker to 99840
DEBUG: Testing 100096.jpg, nope - so move max marker to 100096
DEBUG: Testing 99968.jpg, exists - so move min marker to 99968
DEBUG: Testing 100032.jpg, nope - so move max marker to 100032
DEBUG: Testing 100000.jpg, exists - so move min marker to 100000
DEBUG: Testing 100016.jpg, nope - so move max marker to 100016
DEBUG: Testing 100008.jpg, nope - so move max marker to 100008
DEBUG: Testing 100004.jpg, nope - so move max marker to 100004
DEBUG: Testing 100002.jpg, nope - so move max marker to 100002
DEBUG: Testing 100001.jpg, nope - so move max marker to 100001
Result:100000
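For anyone who prefers not to shell out, a PHP port of the same idea might look like the sketch below. It is not the original author's code. The names findHighestNumber and imageExists are hypothetical, and it relies on the same assumption as the script: numbering starts at 1 and has no gaps. Unlike the bash version, the existence check here tries the three extensions mentioned in the question.

```php
<?php
/**
 * Returns the highest n for which $exists(n) is true,
 * assuming the numbering starts at 1 and has no gaps.
 */
function findHighestNumber(callable $exists): int
{
    if (!$exists(1)) {
        return 0; // no numbered files at all
    }
    // Phase 1: double the probe until a missing number is found.
    $min = 1; // highest number known to exist
    $max = 2; // next number to probe
    while ($exists($max)) {
        $min = $max;
        $max *= 2;
    }
    // Phase 2: bisect between the last hit ($min) and the first miss ($max).
    while ($max - $min > 1) {
        $mid = intdiv($min + $max, 2);
        if ($exists($mid)) {
            $min = $mid;
        } else {
            $max = $mid;
        }
    }
    return $min;
}

// Hypothetical existence check covering the extensions from the question.
function imageExists(string $dir, int $n): bool
{
    foreach (['jpg', 'png', 'gif'] as $ext) {
        if (is_file("$dir/$n.$ext")) {
            return true;
        }
    }
    return false;
}
```

A call could look like findHighestNumber(fn (int $n) => imageExists('path/to/dir', $n)), which needs PHP 7.4 or later for the arrow function. Passing the check as a callable keeps the search itself independent of how files are named or stored.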
