
How do I sort alphanumeric strings in Bash?

I want to sort a list of files based on file name.

Input

280900_b24.txt
280900_b23.txt
280900_b25.txt
280900_b28.txt
280900.txt
280900_b27.txt
280900_b22.txt
280900_b30.txt
280900_b29.txt
280902.txt
280902_b01.txt
280901_b08.txt
280901.txt
280900_b26.txt

Expected output

280902_b01.txt
280902.txt
280901_b08.txt
280901.txt
280900_b30.txt
280900_b29.txt
280900_b28.txt
280900_b27.txt
280900_b26.txt
280900_b25.txt
280900_b24.txt
280900_b23.txt
280900_b22.txt
280900.txt

The closest I can get is with sort -r:

280902.txt
280902_b01.txt
280901.txt
280901_b08.txt
280900.txt
280900_b30.txt
280900_b29.txt
280900_b28.txt
280900_b27.txt
280900_b26.txt
280900_b25.txt
280900_b24.txt
280900_b23.txt
280900_b22.txt

But I want files with _b# in the name to come before files without it. For example, I want 280902_b01.txt to come before 280902.txt.

I cannot test it, but I believe you can do:

 sort -k1.1,1.6r -k1.8,1.8 -k1.9r

This, however, will give problems with

 280900.txt
 280900_b30.txt
 280900_s30.txt

So it might be better to do

 sort -k1.1,1.6r -k1.7,1.7 -k1.8r

The latter is better: it reverse-sorts on the first six characters, then, where those tie, does a normal sort on the 7th character, and finally reverse-sorts the remainder. Sorting on the 7th character is what solves the underscore-versus-dot problem.
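A minimal sketch of the character-position approach, assuming GNU sort and forcing the C locale so that comparison is byte-wise. Whether `.` or `_` sorts first depends on the locale; in the C locale `.` (0x2E) precedes `_` (0x5F), so this sketch also reverses the 7th-character key to put the `_b` names first. The `names` and `sorted` variables are illustrative only, and the input is a subset of the question's data.

```shell
# Sample names from the question (illustrative subset).
names='280900_b24.txt
280900.txt
280900_b30.txt
280902.txt
280902_b01.txt
280901_b08.txt
280901.txt'

# Key 1: characters 1-6, reversed (largest prefix first).
# Key 2: character 7, reversed ('_' before '.' in the C locale).
# Key 3: character 8 to end of line, reversed (b30 before b24).
sorted=$(printf '%s\n' "$names" | LC_ALL=C sort -k1.1,1.6r -k1.7,1.7r -k1.8r)

printf '%s\n' "$sorted"
```

In a locale with a different collation order for punctuation, the second key may need to be a forward sort instead, as in the command above.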

It looks like you want a reverse sort on the first, numeric part, with . and _ treated as equal, and a forward (non-reverse) sort on everything after that. This does what you say you want when I try it with your data:

sort -k1.1,1.6r -k1.8,1.14 input.txt

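This does a reverse sort on characters 1-6, ignores character 7, and does a forward sort on characters 8 to 14. Because the second key is a forward sort, the _b suffixes within one prefix come out in ascending order (b22 before b30).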
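To see what the two keys do, here is a small sketch on a subset of the question's data, assuming GNU sort and the C locale (forced here so the result does not depend on the environment). The variable names are illustrative only.

```shell
# Sample names from the question (illustrative subset).
names='280900_b24.txt
280900.txt
280900_b30.txt
280902.txt
280902_b01.txt
280901_b08.txt
280901.txt'

# Key 1: characters 1-6, reversed.
# Key 2: characters 8-14, forward; character 7 ('.' or '_') is skipped.
# For "280900.txt" the second key is just "txt", which sorts after "b..".
sorted=$(printf '%s\n' "$names" | LC_ALL=C sort -k1.1,1.6r -k1.8,1.14)

printf '%s\n' "$sorted"
```

With this input the 280900 group comes out as 280900_b24.txt, 280900_b30.txt, 280900.txt: the _b names precede the plain name, but in ascending order.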

You can do:

$ echo "280900_b24.txt
280900_b23.txt
280900_b25.txt
280900_b28.txt
280900.txt
280900_b27.txt
280900_b22.txt
280900_b30.txt
280900_b29.txt
280902.txt
280902_b01.txt
280901_b08.txt
280901.txt
280900_b26.txt" | sort -t _ -k1r
280902_b01.txt
280902.txt
280901_b08.txt
280901.txt
280900_b30.txt
280900_b29.txt
280900_b28.txt
280900_b27.txt
280900_b26.txt
280900_b25.txt
280900_b24.txt
280900_b23.txt
280900_b22.txt
280900.txt

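Explanation (with the n modifier added, as used in the examples below):

sort -t _ -k1rn
      ^                 split
        ^               on the underscore
            ^           the key starts at field 1 and, with no end given,
                        runs to the end of the line
             ^          r: reverse order for this key
              ^         n: compare the key numerically (by its leading number);
                        lines that tie are then ordered by a plain comparison
                        of the whole line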
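Because a key written as -k1 with no end position runs to the end of the line, -k1r amounts to a reverse sort of the whole line, and where `_` lands relative to `.` is decided by the locale's collation order. A minimal sketch, assuming GNU sort, that forces the C locale so that comparison is byte-wise and `_` (0x5F) ranks above `.` (0x2E); the variable names are illustrative only.

```shell
# Sample names from the question (illustrative subset).
names='280900_b24.txt
280900.txt
280900_b30.txt
280902.txt
280902_b01.txt
280901_b08.txt
280901.txt'

# Whole-line reverse sort in the C locale.
by_line=$(printf '%s\n' "$names" | LC_ALL=C sort -r)

# The -t _ -k1r form: the key spans field 1 to the end of the line,
# so it compares the same text as the whole-line sort above.
by_key=$(printf '%s\n' "$names" | LC_ALL=C sort -t _ -k1r)

printf '%s\n' "$by_key"
```

This may explain why plain sort -r gives a different order on some systems: the result depends on the locale in effect.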

To see how the part after the first number is ordered when only the first key is numeric, consider:

$ echo {9..11}_{9..11}.txt | tr ' ' '\n' 
9_9.txt
9_10.txt
9_11.txt
10_9.txt
10_10.txt
10_11.txt
11_9.txt
11_10.txt
11_11.txt

If you sort that in the same fashion:

$ echo {9..11}_{9..11}.txt | tr ' ' '\n' | sort -t _ -k1rn
11_10.txt
11_11.txt
11_9.txt
10_10.txt
10_11.txt
10_9.txt
9_10.txt
9_11.txt
9_9.txt

The remainder is compared "asciibetically" (as plain text), which is why 10 and 11 come before 9. If you want a numeric sort on the remaining field as well:

$ echo {9..11}_{9..11}.txt | tr ' ' '\n' | sort -t _ -k1rn -k2rn
11_11.txt
11_10.txt
11_9.txt
10_11.txt
10_10.txt
10_9.txt
9_11.txt
9_10.txt
9_9.txt
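A variant of the same idea with explicit key ends, so that each key covers exactly one field. This is a sketch assuming GNU sort; it builds the list with a portable loop instead of brace expansion, and the variable names are illustrative only.

```shell
# Build 9_9.txt ... 11_11.txt without relying on Bash brace expansion.
names=$(for a in 9 10 11; do
          for b in 9 10 11; do
            printf '%s_%s.txt\n' "$a" "$b"
          done
        done)

# -k1,1nr: field 1 only, numeric, reversed.
# -k2,2nr: field 2 only, numeric, reversed ("9.txt" is read as 9).
sorted=$(printf '%s\n' "$names" | LC_ALL=C sort -t _ -k1,1nr -k2,2nr)

printf '%s\n' "$sorted"
```

Limiting each key with M,M keeps the first key from extending into the second field, which makes the intent easier to read even when the result is the same.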
