Sort a find command to respect a custom order in Unix

I have a script that outputs file paths (via find), which I want to sort based on very specific custom logic:

  • 1st sort key: I want the 2nd and, if present, the 3rd "-"-separated field to be sorted using a custom order based on a list of keys I supply, ignoring any numerical suffix.
    With the sample input below, the list of keys is:
    rp,alpha,beta-ri,beta-rs,RC

  • 2nd sort key: numeric sorting by the trailing number on each line.

Given the following sample input (note that the /foo/bar/test/example/8.2.4.0 prefix of each line is incidental):

/foo/bar/test/example/8.2.4.0-RC10
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-rp2

I expect:

/foo/bar/test/example/8.2.4.0-rp2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC10

Using a variant of my answer to your original question:

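./your-script | awk -v keysInOrder='rp,alpha,beta-ri,beta-rs,RC' '
    BEGIN {
      FS=OFS="-"
      # Map each key to its position in the custom order.
      keyCount = split(keysInOrder, a, ",")
      for (i = 1; i <= keyCount; ++i) keysToOrdinal[a[i]] = i
    }
    {
      # Build the sort key from field 2 (plus field 3, if present), minus the numeric suffix.
      sortKey = $2
      if (NF == 3) sortKey = sortKey FS $3
      sub(/[0-9]+$/, "", sortKey)
      # Split the trailing number off so that, once the ordinal is prepended, it is always field 5.
      auxFieldPrefix = "|" FS
      if (NF == 2) auxFieldPrefix = auxFieldPrefix FS
      sub(/[0-9]/, auxFieldPrefix "&", $NF)
      # Keys that are not in the list get an ordinal past the end of the list.
      sortOrdinal = sortKey in keysToOrdinal ? keysToOrdinal[sortKey] : keyCount + 1
      print sortOrdinal, $0
    }
'  | sort -t- -k1,1n -k3,3 -k5,5n | sed 's/^[^-]*-//; s/|-\{1,2\}//'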

./your-script represents whatever command produces the output you want to sort.

Note that an auxiliary character, |, is used to facilitate sorting, on the assumption that this character doesn't appear in the input, which should be reasonably safe, given that filesystem paths usually don't contain pipe characters.

Any field 2 values (sans numeric suffix) that aren't in the list of sort keys sort after the field 2/3 values that are, and are sorted alphabetically among themselves.
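To make the role of the auxiliary | character and of the ordinal prefix more concrete, here is a small sketch that runs only the awk stage and prints the intermediate, decorated lines. The function name decorate and the gamma3 sample line are made up for illustration; the awk code itself is the same as in the command above.

```shell
#!/usr/bin/env bash
# decorate: hypothetical wrapper around the awk stage of the pipeline above.
decorate() {
  awk -v keysInOrder='rp,alpha,beta-ri,beta-rs,RC' '
    BEGIN {
      FS = OFS = "-"
      keyCount = split(keysInOrder, a, ",")
      for (i = 1; i <= keyCount; ++i) keysToOrdinal[a[i]] = i
    }
    {
      sortKey = $2
      if (NF == 3) sortKey = sortKey FS $3
      sub(/[0-9]+$/, "", sortKey)
      auxFieldPrefix = "|" FS
      if (NF == 2) auxFieldPrefix = auxFieldPrefix FS
      sub(/[0-9]/, auxFieldPrefix "&", $NF)
      sortOrdinal = sortKey in keysToOrdinal ? keysToOrdinal[sortKey] : keyCount + 1
      print sortOrdinal, $0
    }'
}

# gamma3 is an invented line whose key is not in the list.
printf '%s\n' \
  /foo/bar/test/example/8.2.4.0-RC10 \
  /foo/bar/test/example/8.2.4.0-beta-ri2 \
  /foo/bar/test/example/8.2.4.0-gamma3 | decorate
# Prints:
# 5-/foo/bar/test/example/8.2.4.0-RC|--10
# 3-/foo/bar/test/example/8.2.4.0-beta-ri|-2
# 6-/foo/bar/test/example/8.2.4.0-gamma|--3
```

In every decorated line the ordinal is field 1, the first word of the key is field 3, and the number is field 5, which is what the sort -t- -k1,1n -k3,3 -k5,5n stage relies on. The unknown key gamma gets ordinal 6, one past the five listed keys.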

While this does not match what the OP is looking for, it is worth pointing out that the sort command has a -V option for version sorting (available in GNU sort, at least). As the output below shows, it compares embedded numbers numerically and otherwise follows the order of characters in the ASCII table (i.e. UPPERCASE letters first, lowercase letters next).

For example:

cat test.sort.txt 
/foo/bar/test/example/8.2.4.0-RC10
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-rp2

And sorting:

 % sort -V test.sort.txt              
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC10
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-rp2
/foo/bar/test/example/8.2.4.0-rp10

So, it is useful to be aware of this when giving version names.

That said, if you insist, here is a one-liner that uses sed to enforce the custom order:

sed -e 's/-rp/-x1xrp/;s/-alpha/-x2xalpha/;s/-beta-ri/-x3xbeta-ri/;s/-beta-rs/-x4xbeta-rs/;s/-RC/-x5xRC/' test.sort.txt | sort -V | sed -e 's/-x[1-5]x/-/'
/foo/bar/test/example/8.2.4.0-rp2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC10
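The same tagging trick can be generated from a list of keys instead of being typed out by hand. The following is only a sketch: the function name tag_sort is hypothetical, it assumes a sort that supports -V (e.g. GNU sort), and it assumes the keys contain no characters that are special to sed (such as / or regex metacharacters).

```shell
#!/usr/bin/env bash
# tag_sort KEY...: hypothetical helper. Sorts the lines on stdin by the given
# key order first, then in version order (needs sort -V, e.g. GNU sort).
tag_sort() {
  local script='' i=0 k
  for k in "$@"; do
    i=$((i + 1))
    # Rewrite "-KEY<digit>" as "-x<ordinal>xKEY<digit>" so the ordinal is compared first.
    script="${script}s/-${k}\([0-9]\)/-x${i}x${k}\1/;"
  done
  # Tag, sort, then strip the tag again.
  sed -e "$script" | sort -V | sed -e 's/-x[0-9][0-9]*x/-/'
}

printf '%s\n' \
  /foo/bar/test/example/8.2.4.0-RC2 \
  /foo/bar/test/example/8.2.4.0-alpha10 \
  /foo/bar/test/example/8.2.4.0-rp2 \
  /foo/bar/test/example/8.2.4.0-alpha2 | tag_sort rp alpha beta-ri beta-rs RC
# Prints:
# /foo/bar/test/example/8.2.4.0-rp2
# /foo/bar/test/example/8.2.4.0-alpha2
# /foo/bar/test/example/8.2.4.0-alpha10
# /foo/bar/test/example/8.2.4.0-RC2
```

Requiring a digit right after the key keeps a short key such as rp from matching inside a longer word.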

I found a solution totally different from what @mklement0 suggested.

#!/bin/bash

echo "Enter a version :"
read -r VERSION

while read -r line;
do

  find "$line" -type d | grep "$VERSION" | sort -n >> outfile.txt

  grep '.*-alpha[0-9]' outfile.txt | sort -n >> outfile2.txt 
  grep '.*-beta-ri[0-9]' outfile.txt | sort -n >> outfile2.txt 
  grep '.*-beta-rs[0-9]' outfile.txt | sort -n >> outfile2.txt 
  grep '.*-RC[0-9]' outfile.txt | sort -n >> outfile2.txt   
  rm outfile.txt 

done <whatever.txt

Content of outfile2.txt :

/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-alpha8
/foo/bar/test/example/8.2.4.0-alpha9
/foo/bar/test/example/8.2.4.0-beta-ri1
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-rs1
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-beta-rs3
/foo/bar/test/example/8.2.4.0-RC1

The only thing wrong with this is that alpha10 came before alpha8.

Any clue?
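A likely explanation, offered as a sketch rather than a definitive diagnosis: sort -n only looks for a number at the start of each line. All of these lines start with /, so they compare as equal numerically, and sort falls back to comparing the whole line character by character, where alpha10 comes before alpha8. One way around this is to put the trailing number in front of each line, sort on it numerically, and then remove it again. The helper name sort_by_trailing_number below is made up for illustration.

```shell
#!/usr/bin/env bash
# sort_by_trailing_number: hypothetical helper, not part of the script above.
sort_by_trailing_number() {
  # Decorate: prefix each line with its trailing number and a tab.
  awk '{ n = $0; sub(/^.*[^0-9]/, "", n); print n "\t" $0 }' |
    sort -t "$(printf '\t')" -k1,1n |   # sort numerically on that prefix
    cut -f2-                            # undecorate: drop the prefix again
}

printf '%s\n' \
  /foo/bar/test/example/8.2.4.0-alpha10 \
  /foo/bar/test/example/8.2.4.0-alpha8 \
  /foo/bar/test/example/8.2.4.0-alpha9 | sort_by_trailing_number
# Prints:
# /foo/bar/test/example/8.2.4.0-alpha8
# /foo/bar/test/example/8.2.4.0-alpha9
# /foo/bar/test/example/8.2.4.0-alpha10
```

In the script above, this would replace the sort -n after each grep. Where sort -V is available, it should give the same order within each group.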

