How can I use perl to delete files matching a regex

Question

Due to a Makefile mistake, I have some fake files in my git repo...

$ ls
=0.1.1                  =4.8.0                  LICENSE
=0.5.3                  =5.2.0                  Makefile
=0.6.1                  =7.1.0                  pyproject.toml
=0.6.1,                 all_commands.txt        README_git_workflow.md
=0.8.1                  CHANGES.md              README.md
=1.2.0                  ciscoconfparse/         requirements.txt
=1.7.0                  configs/                sphinx-doc/
=2.0                    CONTRIBUTING.md         tests/
=2.2.0                  deploy_docs.py          tutorial/
=22.2.0                 dev_tools/              utils/
=22.8.0                 do.py
=2.7.0                  examples/
$

I tried this, but it seems that there may be some more efficient means to accomplish this task ...

# glob "*" will list all files globbed against "*"
foreach my $filename (grep { /\W\d+\.\d+/ } glob "*") {
    my $cmd1 = "rm $filename";
    `$cmd1`;
}

Question:

I want a remove command that matches against a pcre.
What is a more efficient perl solution to delete the files matching this perl regex: /\W\d+\.\d+/ (example filename: '=0.1.1') ?

Answer 1

Fetch a wider set of files and then filter them by whatever criteria you want:

my @files_to_del = grep { /^\W[0-9]+\.[0-9]+/ and not -d } glob "*";  # bare names, so ^ anchors at the start of the file name

I added an anchor ( ^ ) so that the regex can only match a string that begins with that pattern, otherwise this can blow away files other than intended. Reconsider what exactly you need.
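To see what the anchor changes, here is a minimal sketch that runs both regexes over a small list of names. The first three names come from the listing in the question; notes-1.5.txt is a made-up name, added only to show a file that the unanchored regex would catch by accident.

```perl
use strict;
use warnings;

# Sample names: three from the question's listing, plus one
# hypothetical name ("notes-1.5.txt") that should NOT be deleted.
my @names = ('=0.1.1', '=22.8.0', 'README.md', 'notes-1.5.txt');

# Without ^ the pattern may match anywhere in the name.
my @unanchored = grep { /\W[0-9]+\.[0-9]+/ }  @names;

# With ^ the non-word character must be the first character.
my @anchored   = grep { /^\W[0-9]+\.[0-9]+/ } @names;

print "unanchored: @unanchored\n";
print "anchored:   @anchored\n";
```

The unanchored list includes notes-1.5.txt (the - before 1.5 is a non-word character), while the anchored list holds only the two names that start with =.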

Altogether, perhaps (or see the one-liner below ^†):

use warnings;
use strict;
use feature 'say';

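use File::Glob ':bsd_glob';       # for better glob()
use File::Basename qw(basename);  # to test the name, not the whole path
use Cwd qw(cwd);                  # current-working-directory

my $dir = shift // cwd;      # cwd by default, or from input

my $re = qr/^\W[0-9]+\.[0-9]+/;

# glob "$dir/*" returns paths that start with $dir, so the anchored
# regex is applied to the file name (basename) only
my @files_to_del = grep { basename($_) =~ $re and not -d } glob "$dir/*";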

say for @files_to_del;  # please inspect first

#unlink or warn "Can't unlink $_: $!" for @files_to_del;

The * in glob can be narrowed to pre-select files, if suitable. In particular, if the = is a literal character (and not an indicator printed by ls, see footnote ^‡), then glob "=*" fetches only the files starting with it, and you can pass those through the grep filter.

I exclude directories, identified by the -d filetest, since we are looking for files (and so as not to run into the scary language about directories in the unlink documentation; thanks to brian d foy for the comment).

If you need to scan subdirectories and do the same there, perhaps recursively (which doesn't seem to be the case here), then the same logic can go into File::Find::find (or File::Find::Rule, or others).
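A minimal sketch of the recursive variant with the core File::Find module follows. The sub name find_matching is made up for this illustration, and the sketch only lists matches; it deletes nothing.

```perl
use strict;
use warnings;
use File::Find;

# Collect regular files under $top whose own name (not the path)
# matches $re. find_matching is a hypothetical helper name.
sub find_matching {
    my ($top, $re) = @_;
    my @found;
    find(
        sub {
            # In the callback, $_ is the bare file name and
            # $File::Find::name is the path including $top.
            push @found, $File::Find::name if -f $_ and $_ =~ $re;
        },
        $top,
    );
    return @found;
}

# List the matches under each directory given on the command line.
for my $dir (@ARGV) {
    print "$_\n" for find_matching($dir, qr/^\W[0-9]+\.[0-9]+/);
}
```

Because the callback sees the bare name in $_, the ^ anchor applies to the file name even for files deep in the tree.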

Or read the directory any other way (opendir + readdir, libraries like Path::Tiny) and filter.
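For the opendir + readdir route, a minimal sketch might look like this. The sub name matching_files is made up for the illustration, and again nothing is deleted.

```perl
use strict;
use warnings;

# Return paths of non-directory entries in $dir whose name matches $re.
# matching_files is a hypothetical helper name.
sub matching_files {
    my ($dir, $re) = @_;
    opendir my $dh, $dir or die "Can't open $dir: $!";
    my @files;
    while (defined(my $name = readdir $dh)) {
        # readdir returns bare names (including . and ..), so the
        # regex is tested on the name and the path is built afterwards.
        next unless $name =~ $re;
        my $path = "$dir/$name";
        push @files, $path unless -d $path;
    }
    closedir $dh;
    return @files;
}

# List the matches in each directory given on the command line.
for my $dir (@ARGV) {
    print "$_\n" for matching_files($dir, qr/^\W[0-9]+\.[0-9]+/);
}
```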

^† Or, a quick one-liner... print (to inspect) what's about to get blown away

perl -wE'say for grep { /^\W[0-9]+\.[0-9]+/ and not -d } glob "*"'

and then delete 'em

perl -wE'unlink or warn "$_: $!" for grep /^\W[0-9]+\.[0-9]+/ && !-d, glob "*"'

(I switched to a more compact syntax here just because. It isn't necessary.)

If you'd like to be able to pass a directory to it (optionally, or work in the current one) then do

perl -wE'$d = shift//q(.); ...'  dirpath (relative path fine. optional)

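and then use glob "$d/*" in the code. This works the same way as in the script above: shift pulls the first element from @ARGV if anything was passed on the command line, or, if @ARGV is empty, it returns undef and the // (defined-or) operator picks up the string q(.). Note that glob "$d/*" returns paths that begin with $d/, so the ^-anchored regex then has to be tested against the file name alone (for example via basename from File::Basename) rather than against the whole path.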

^‡ That leading = may be a file-type "indicator" if ls has been aliased to ls -F, which can be checked by running ls with aliases suppressed, one way being \ls (or check alias ls).

If that is so, the = marks the file as a socket, which in Perl can be tested for with the -S filetest.

Then the \W in the proposed regex may need to be changed to \W? to allow for no non-word character preceding the digits, along with a test for a socket, like so:

my $re = qr/^\W? [0-9]+ \. [0-9]+/x;

my @files_to_del = grep { /$re/ and -S } glob "*";  # bare names, so ^ anchors at the start of the file name

Answer 2

Why not just:

$ rm =*

Sometimes, shell commands are the best option.

Answer 3

In these cases, I use perl to merely filter the list of files:

ls | perl -ne 'print if /\A\W\d+\.\d+/a' | xargs rm

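And, when I do that, I feel guilty for not doing something simpler with an extended pattern in grep (using [0-9], since \d is not part of POSIX extended regular expressions and GNU grep -E does not treat it as a digit class):

ls | grep -E '^\W[0-9]+\.[0-9]+' | xargs rm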

Eventually I'll run into a problem where there's a directory so I need to be more careful about the file list:

find . -maxdepth 1 -type f | grep -E '^\./\W[0-9]+\.[0-9]+' | xargs rm

Or I need to allow rm to remove directories too should I want that:

ls | grep -E '^\W[0-9]+\.[0-9]+' | xargs rm -r

Answer 4

Here you go.

unlink( grep { /\W\d+\.\d+/ && !-d } glob( "*" ) );

This matches the filename, and excludes directories.
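When unlink is given a list, it returns the number of files it deleted, which does not say which ones failed. If that matters, a sketch along these lines deletes one file at a time and collects the failures. The sub name delete_files is made up for the illustration.

```perl
use strict;
use warnings;

# Delete each path in turn. Returns the number of files removed and
# a reference to a list of "path: error" strings for the failures.
# delete_files is a hypothetical helper name.
sub delete_files {
    my @paths = @_;
    my $removed = 0;
    my @failed;
    for my $path (@paths) {
        if (unlink $path) { $removed++ }
        else              { push @failed, "$path: $!" }
    }
    return ($removed, \@failed);
}

# Delete the files named on the command line and report the outcome.
my ($removed, $failed) = delete_files(@ARGV);
print "removed $removed file(s)\n";
warn "Can't unlink $_\n" for @$failed;
```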

Answer 5

To delete filenames matching the regex /\W\d+\.\d+/, use one of the following one-liners:

1> $fn is a filename. I'm also dropping the my keywords, since the one-liner doesn't have to worry about Perl lexical scopes:

perl -e 'foreach $fn (grep { /\W\d+\.\d+/ } glob "*") {$cmd1="rm $fn";`$cmd1`;}'

2> Or, as Andy Lester responded, perhaps his answer is as efficient as we can make it:

perl -e 'unlink(grep { /\W\d+\.\d+/ } glob "*");'

5 answers

solution1
4 ACCEPTED 2022-10-06 16:44:44

solution2
3 2022-10-06 13:26:36

solution3
2 2022-10-07 17:06:07

solution4
1 2022-10-07 20:26:11

solution5
0 2022-10-06 12:24:04