
Fast directory clean-up with Perl

I need to clean up a directory containing millions of log files on my web server, and I've found a great article on how to do this. There are, however, a couple of interesting things in that one-liner that I would like to understand.

Here's the Perl code I am interested in:

for(<*>){((stat)[9]<(unlink))}

It is run with perl -e 'code' .

So, here are my questions:

  1. the for(<*>) construction - I assume it iterates through the files in the current directory. But where does it store the iterator?
  2. the stat and unlink functions expect at least one argument, I assume... But where is it?
  3. why is the result of calling (stat)[9] compared to the result of calling (unlink)? And what does that result in?

Sorry, I am a no-perl-ish guy, thus I do not understand all those Perl abbreviations. That's why I am asking this question.

Thanks!

That one liner takes many shortcuts:

  1. The <*> is a special case of the diamond operator. You can't access an iterator object, like in other languages. Here, it calls the glob function. In list context it returns a list of all the results (which are either lines of a file or, as in your case, the contents of a directory). The return value is passed to for, which iterates over the list and aliases each value to $_ . $_ is the "default variable" for many functions…
  2. Which brings us here. Many core functions default to $_ when called with no argument. So do unlink and stat.
  3. (stat)[9] means: execute stat in list context and select the 10th element of the result (indices start at zero); this is the modification time. Compare that to an array access like $foo[9] .
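
The three points above can be tried out safely with a small sketch. It works in a scratch directory created with File::Temp (an assumption of this example, not something the one-liner uses), and the sub name mtimes_via_default_var is hypothetical. Nothing is deleted here; it only shows for aliasing each glob result to $_ and stat picking up $_ when called with no argument.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: returns name => mtime pairs for everything matched
# by * in $dir, relying on $_ the same way the one-liner does.
# Assumes $dir contains no whitespace, since glob splits patterns on it.
sub mtimes_via_default_var {
    my ($dir) = @_;
    my %mtime;
    for (glob "$dir/*") {          # for aliases each result to $_
        $mtime{$_} = (stat)[9];    # stat with no argument operates on $_
    }
    return %mtime;
}

# Scratch directory with two empty files; removed automatically at exit.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.log b.log)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

my %mtime = mtimes_via_default_var($dir);
print "$_ => $mtime{$_}\n" for sort keys %mtime;
```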

The code

for(<*>){((stat)[9]<(unlink))}

is equivalent to:

for my $file (<*>) {
    my $mtime = (stat($file))[9];
    $mtime < unlink($file);
}

<*> can also be replaced with glob "*" , which might be more readable.

The code will delete all files in the current directory that match * (names beginning with a dot are not matched). It will not delete directories.

Note that the comparison in the last statement of the loop is completely redundant: its result is discarded, and only the side effect of unlink matters. If use warnings is in effect, it will give the warning:

Useless use of numeric lt (<) in void context

For this code to make sense, I would expect a comparison that actually matters, like comparing $mtime to some cutoff time to know which logs are old, e.g.:

if ($mtime < $oldtime) {
    unlink $file or die "Cannot unlink $file: $!";
}

Note also that it might be prudent to check for failure when deleting files.
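
Putting the pieces together, a minimal sketch of an age-based clean-up could look like the following. The sub name remove_old_files, the seven-day threshold, and the File::Temp scratch directory are all assumptions made for illustration; only the $mtime < $oldtime test and the checked unlink come from the snippet above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: delete plain files in $dir whose modification time
# is more than $max_age_days ago, and return how many were removed.
# Assumes $dir contains no whitespace, since glob splits patterns on it.
sub remove_old_files {
    my ($dir, $max_age_days) = @_;
    my $oldtime = time() - $max_age_days * 24 * 60 * 60;   # cutoff in epoch seconds
    my $removed = 0;
    for my $file (glob "$dir/*") {
        next unless -f $file;                  # leave directories alone
        my $mtime = (stat($file))[9];
        if ($mtime < $oldtime) {
            unlink $file or die "Cannot unlink $file: $!";
            $removed++;
        }
    }
    return $removed;
}

# Demo in a scratch directory: one file is back-dated by ten days.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(old.log new.log)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}
my $ten_days_ago = time() - 10 * 24 * 60 * 60;
utime $ten_days_ago, $ten_days_ago, "$dir/old.log"
    or die "Cannot set mtime on $dir/old.log: $!";

my $count = remove_old_files($dir, 7);
print "removed $count file(s)\n";
```

Dying on the first failed unlink follows the snippet above; for a directory with millions of files, replacing die with warn so the loop keeps going may be the more practical choice.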

  1. the for(<*>) construction - I assume it iterates through the files in the current directory. But where does it store the iterator?

for loops iterate over lists, so if <*> produces a list, then your code is just a run-of-the-mill for loop. As it turns out, <*> is another way to spell glob(), which expands a shell-style wildcard pattern into matching file names (sort of like a regex for file names), and glob() returns a list in list context, which is the context a for loop provides. See: http://perldoc.perl.org/functions/glob.html .
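
To make the context point concrete, here is a small sketch contrasting the two ways glob can be called. The list-context behaviour is what the for loop relies on; the scalar-context behaviour (each call hands back the next match, then undef once the matches run out) is documented in perldoc -f glob and is the closest thing to an iterator here, although its state is internal to Perl. The File::Temp scratch directory and the variable names are assumptions of this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with three empty files; removed automatically at exit.
# Assumes the temporary path contains no whitespace (glob splits on it).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.log b.log c.log)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

# List context: glob returns every match at once, as a for loop expects.
my @all = glob("$dir/*");

# Scalar context: each evaluation returns the next match, then undef.
my @one_by_one;
while (defined(my $file = glob("$dir/*"))) {
    push @one_by_one, $file;
}

print scalar(@all), " matches in list context, ",
      scalar(@one_by_one), " fetched one at a time\n";
```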

Note that the single quotes around the code keep the shell from expanding the * ; if the shell expanded it first, perl would never see it.
