
How to sort strings that contain a common prefix and suffix numerically from Bash?

Here is a list of files:

some.string_100_with_numbers.in-it.txt
some.string_101_with_numbers.in-it.txt
some.string_102_with_numbers.in-it.txt
some.string_23_with_numbers.in-it.txt
some.string_24_with_numbers.in-it.txt
some.string_25_with_numbers.in-it.txt

Now I would like to sort them numerically, starting with *23* and ending with *102*.

I have tried sort's -n and -g options; -t does not help with these messy strings.

Can I ignore the text leading up to the number with an option, or do I have to be clever and write a script?

Use ls -lv

From the man page:

-v     natural sort of (version) numbers within text
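As a rough illustration of what -v does, here is a small sketch. It assumes GNU ls (the -v option is not part of POSIX ls, and on some other systems it means something different), and the scratch directory is made up for the example.

```shell
# Assumes GNU coreutils ls; the scratch directory is hypothetical.
dir=$(mktemp -d)
for n in 100 101 102 23 24 25; do
    touch "$dir/some.string_${n}_with_numbers.in-it.txt"
done

# -1 prints one name per line, -v sorts embedded numbers naturally.
sorted=$(ls -1v "$dir")
printf '%s\n' "$sorted"

rm -r "$dir"
```

With GNU ls the names containing 23, 24 and 25 should be listed before those containing 100, 101 and 102.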

Try the following:

sort -t '_' -k 2,2n
  • -t '_' (sets the field delimiter to the underscore character)
  • -k 2,2n (sorts on the second field only, using numeric ordering)
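The options above can be tried on a few of the names from the question. This is only a sketch: the names are piped in on standard input rather than read from a directory, and the key is written as 2,2 so that it stops at the end of the second field.

```shell
# A few file names from the question, one per line.
names='some.string_100_with_numbers.in-it.txt
some.string_23_with_numbers.in-it.txt
some.string_102_with_numbers.in-it.txt
some.string_24_with_numbers.in-it.txt'

# Split on "_" and compare field 2 as a number.
sorted=$(printf '%s\n' "$names" | sort -t '_' -k 2,2n)
printf '%s\n' "$sorted"
```

This approach depends on the number always sitting in the same underscore-separated field; if the prefix itself contains underscores, the field number has to change accordingly.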


In the general case, try the Schwartzian transform.

Briefly, break out the number into its own field, sort on that, and discard the added field.

# Keep a literal tab in a variable so it survives copy and paste
tab=$(printf '\t')
sed "s/^\([^0-9]*\)\([0-9][0-9]*\)/\2${tab}\1\2/" file |
sort -n |
cut -f2-

This works nicely even when the input has no obvious separator, as in the following input:

abc1
abc10
abc2

where you would like the sort to move the last line up right after the first.
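To make the three steps visible on that input, here is a sketch that prints the decorated intermediate lines before sorting. The variable names (tab, decorated, sorted) are only for illustration.

```shell
# Tab character used as the temporary field separator.
tab=$(printf '\t')

# Step 1: decorate each line with its first run of digits, followed by a tab.
decorated=$(printf '%s\n' abc1 abc10 abc2 |
    sed "s/^\([^0-9]*\)\([0-9][0-9]*\)/\2${tab}\1\2/")
printf '%s\n' "$decorated"

# Steps 2 and 3: sort numerically on the added field, then cut it off again.
sorted=$(printf '%s\n' "$decorated" | sort -n | cut -f2-)
printf '%s\n' "$sorted"
```

The final output lists abc1, abc2, abc10. Lines without any digits pass through sed unchanged and, having no tab, are printed whole by cut.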

If available (GNU sort provides it), simply use sort -V. This is a sort for version numbers, but it works well as a "natural sort" option.

$ ff=$( echo some.string_{100,101,102,23,24,25}_with_numbers.in-it.txt )

Without sort:

$ for f in $ff ; do echo $f ; done
some.string_100_with_numbers.in-it.txt
some.string_101_with_numbers.in-it.txt
some.string_102_with_numbers.in-it.txt
some.string_23_with_numbers.in-it.txt
some.string_24_with_numbers.in-it.txt
some.string_25_with_numbers.in-it.txt

With sort -V:

$ for f in $ff ; do echo $f ; done | sort -V
some.string_23_with_numbers.in-it.txt
some.string_24_with_numbers.in-it.txt
some.string_25_with_numbers.in-it.txt
some.string_100_with_numbers.in-it.txt
some.string_101_with_numbers.in-it.txt
some.string_102_with_numbers.in-it.txt

Not a direct answer, but you can rename the files, moving the number part (zero-padded as needed) to the front as a prefix.

Example: rename 1.txt, 23.txt, 2.txt to 01.txt, 23.txt, 02.txt.

The default ls output is then in numerical order: 01.txt 02.txt 23.txt
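One possible way to do that renaming from the shell is sketched below. It works in a made-up scratch directory, assumes the number is already the leading part of the name, and pads to two digits; adjust the width for larger numbers.

```shell
# Hypothetical scratch directory with the three example files.
dir=$(mktemp -d)
touch "$dir/1.txt" "$dir/23.txt" "$dir/2.txt"

for f in "$dir"/*.txt; do
    base=${f##*/}          # strip the directory, e.g. 1.txt
    num=${base%%.*}        # leading number, e.g. 1
    rest=${base#"$num"}    # remainder, e.g. .txt
    # Note: printf %d reads numbers with a leading 0 as octal, so this
    # assumes the existing names are not zero-padded yet.
    new=$(printf '%02d%s' "$num" "$rest")
    [ "$base" = "$new" ] || mv "$f" "$dir/$new"
done

renamed=$(ls -1 "$dir")
printf '%s\n' "$renamed"
rm -r "$dir"
```

After the loop, a plain ls lists 01.txt, 02.txt, 23.txt in that order.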

A simple Python script might help with the renaming. You can even use a regex to do it in one shot, if you prefer.

import glob
import os
import re

# Rename files that start with a number, e.g. 64.smallest_word_window.py
for f in glob.glob('[0-9]*.py'):
    match = re.match(r'(\d+)(.*)', f)
    # Zero-pad the leading number to four digits: 0064.smallest_word_window.py
    new_name = '%04d%s' % (int(match.group(1)), match.group(2))
    print(new_name)
    os.rename(f, new_name)
