
Sort files based on name using Perl

I have a source folder containing many files with different names:

20160311_TXT_XPL_SLA_Attribution

20160301_TXT_APL_SLA_Attribution

20160301_TXT_XPL_SLA_Attribution

20160302_TXT_APL_SLA_Attribution

I have to sort the files based on the letters between TXT_ and _SLA.

I wrote this Perl script, but it does not sort properly:

    #!/usr/bin/perl

    $dir = "E://Unix";
    my $file;
    my @files;
    opendir (DIR, "$dir");
    while ($file = readdir(DIR)) 
    {
      push (@files, $file);
    }


    print 
         map  { $_->[1] } 
         sort 
         map  { /TXT(.*)SLA/; [$1, $_] }
         @files;

    foreach $file (@files)
    {
      print "$file\n";
    }

    closedir(DIR);

I even tried removing the underscores from the pattern, but the sort order did not change. I am really new to Perl, and it would be a great help if someone could tell me where I am going wrong.

The output I get:

20160301_TXT_APL_SLA_Attribution.txt

20160301_TXT_XPL_SLA_Attribution.txt

20160302_TXT_APL_SLA_Attribution.txt

20160311_TXT_XPL_SLA_Attribution.txt

Expected output:

20160301_TXT_APL_SLA_Attribution.txt

20160302_TXT_APL_SLA_Attribution.txt

20160301_TXT_XPL_SLA_Attribution.txt

20160311_TXT_XPL_SLA_Attribution.txt

Regex used:

/(TXT)(.*)(SLA)/

There are two problems, both here:

print 
     map  { $_->[1] } 
     sort 
     map  { /TXT(.*)SLA/; [$1, $_] }
     @files;

First, your Schwartzian Transform lacks a comparison block for sort. Without one, sort compares the string versions of your array references, which look like ARRAY(0x7ff730805468). You need to add something like sort { $a->[0] cmp $b->[0] }.

Second, sort does not work in place; it returns a new list. The output has to be assigned back to @files, otherwise the later foreach loop still prints the names in their original order.
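
Putting both fixes together might look like the sketch below. The helper name sort_by_tag is made up for illustration, and the non-greedy pattern /TXT_(.*?)_SLA/ is an assumption about the file names; names that do not match get an empty key.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names ordered by the text
# between TXT_ and _SLA, using a Schwartzian Transform.
sub sort_by_tag {
    my @names = @_;
    return
        map  { $_->[1] }                            # 3. unwrap the original name
        sort { $a->[0] cmp $b->[0] }                # 2. compare the extracted keys
        map  { [ /TXT_(.*?)_SLA/ ? $1 : '', $_ ] }  # 1. pair each key with its name
        @names;
}

my @files = qw(
    20160311_TXT_XPL_SLA_Attribution
    20160301_TXT_APL_SLA_Attribution
    20160301_TXT_XPL_SLA_Attribution
    20160302_TXT_APL_SLA_Attribution
);

# sort returns a new list, so the result is assigned back to @files
@files = sort_by_tag(@files);
print "$_\n" for @files;
```

This comparison looks only at the extracted key, so it does not order names that share a key (here the two XPL files) relative to each other.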

A Schwartzian Transform is a useful optimisation only when the data set is huge or the sorting function is complex and slow; otherwise it just makes for unclear code. So it's a shame that it has become the go-to pattern any time someone wants to sort by a function of the data instead of the data itself.

There are a couple of alternatives, and you may prefer a standard sort function like this. The relevant parts of $a and $b are extracted into $aa and $bb respectively, and then they are simply compared.

use strict;
use warnings 'all';
use feature 'say';

chomp( my @data = <DATA> );

say for sort {
    my ($aa, $bb) = map { /TXT_([A-Z]+)_SLA/ } $a, $b;
    $aa cmp $bb;
} @data;

__DATA__
20160311_TXT_XPL_SLA_Attribution
20160301_TXT_APL_SLA_Attribution
20160301_TXT_XPL_SLA_Attribution
20160302_TXT_APL_SLA_Attribution

Output:

20160301_TXT_APL_SLA_Attribution
20160302_TXT_APL_SLA_Attribution
20160311_TXT_XPL_SLA_Attribution
20160301_TXT_XPL_SLA_Attribution
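
Note that the two XPL names above come out in their input order, which is not the order the question asks for. One possible way to get there, as a sketch, is to fall back to comparing the whole name (which begins with the date) whenever the extracted parts are equal. The helper name sort_by_tag_then_name is hypothetical, and the empty-string fallback for non-matching names is an added assumption.

```perl
use strict;
use warnings 'all';
use feature 'say';

# Hypothetical helper: order by the letters between TXT_ and _SLA,
# then by the full name to break ties between equal tags.
sub sort_by_tag_then_name {
    return sort {
        my ($aa, $bb) = map { /TXT_([A-Z]+)_SLA/ ? $1 : '' } $a, $b;
        $aa cmp $bb or $a cmp $b;    # second comparison runs only on a tie
    } @_;
}

say for sort_by_tag_then_name(qw(
    20160311_TXT_XPL_SLA_Attribution
    20160301_TXT_APL_SLA_Attribution
    20160301_TXT_XPL_SLA_Attribution
    20160302_TXT_APL_SLA_Attribution
));
```

With this tie-breaker the names print with both APL files first and then the XPL files in date order, 20160301 before 20160311.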

From the documentation of sort:

If SUBNAME or BLOCK is omitted, sorts in standard string comparison order.

So the following code

 sort
 map  { /TXT(.*)SLA/; [$1, $_] }
 @files;

sorts the arrayref values returned from map by their stringified value (something like ARRAY(0x22bcd48)). The following should sort the arrayrefs by their first element:

 sort { $a->[0] cmp $b->[0] }
 map  { /TXT(.*)SLA/; [$1, $_] }
 @files;
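
To see the stringification for yourself, a tiny sketch like the following prints what a bare sort actually compares. The helper name default_sort_key is hypothetical, and the hex address will differ from run to run.

```perl
use strict;
use warnings;

# Hypothetical helper: the string a bare `sort` would compare for an element.
sub default_sort_key {
    my ($item) = @_;
    return "$item";    # an array reference stringifies to ARRAY(0x...)
}

my $pair = [ 'APL', '20160301_TXT_APL_SLA_Attribution' ];

print default_sort_key($pair), "\n";    # a memory address, not 'APL'
print $pair->[0], "\n";                 # the key the sort block should compare
```

Because the address has nothing to do with the file name, sorting on it gives an order that only looks arbitrary.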
