
How to sort strings that contain a common prefix and suffix numerically from Bash?

Here is a list of files:

some.string_100_with_numbers.in-it.txt
some.string_101_with_numbers.in-it.txt
some.string_102_with_numbers.in-it.txt
some.string_23_with_numbers.in-it.txt
some.string_24_with_numbers.in-it.txt
some.string_25_with_numbers.in-it.txt

Now I would like to sort it numerically, starting with *23* and ending with *102*.

I have tried sort with -n and -g. The -t option does not help with these messy strings.

Can I ignore the string that precedes the number with an option, or do I have to be clever and write a script?

Use ls -lv

From the man page:

-v     natural sort of (version) numbers within text
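
A minimal sketch of the effect, assuming GNU ls (the -v option is not part of POSIX) and a scratch directory created with mktemp. The variable names demo_dir and listing are illustrative only:

```shell
# Create a scratch directory holding files named like those in the question.
demo_dir=$(mktemp -d)
for n in 100 101 102 23 24 25; do
    touch "$demo_dir/some.string_${n}_with_numbers.in-it.txt"
done

# -v sorts the embedded numbers naturally; -1 prints one name per line.
listing=$(ls -1v "$demo_dir")
printf '%s\n' "$listing"

# Clean up the scratch directory.
rm -r "$demo_dir"
```

With GNU ls the name containing 23 should come first and the one containing 102 last.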

Try the following:

sort -t '_' -k 2n
  • -t '_' sets the field delimiter to the underscore character
  • -k 2n sorts on the key that starts at the second field, using numeric ordering
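
As a quick check, the file names from the question can be piped through that command. This sketch writes the key as -k 2,2n, which limits the key to the second field only. That is a small tightening of the command above rather than a different method, and the variable name sorted is illustrative:

```shell
# Feed the names from the question to sort, one per line.
# -t '_' splits on underscores; -k 2,2n compares field 2 as a number.
sorted=$(printf '%s\n' \
    some.string_100_with_numbers.in-it.txt \
    some.string_101_with_numbers.in-it.txt \
    some.string_102_with_numbers.in-it.txt \
    some.string_23_with_numbers.in-it.txt \
    some.string_24_with_numbers.in-it.txt \
    some.string_25_with_numbers.in-it.txt |
    sort -t '_' -k 2,2n)
printf '%s\n' "$sorted"
```

This only works when the number always sits in the same underscore-delimited field, as it does in the question.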


In the general case, try the Schwartzian transform.

Briefly: break the number out into its own field, sort on that field, and then discard the added field.

# The separator after the first \2 must be a literal tab.
# printf produces it here; typing ctrl-v tab in the shell also works.
tab=$(printf '\t')
sed "s/^\([^0-9]*\)\([0-9][0-9]*\)/\2${tab}\1\2/" file |
sort -n |
cut -f2-

This works nicely even if the input doesn't have an obvious separator, as in the following input:

abc1
abc10
abc2

where you would like the sort to move the last line up to right after the first.
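
The same decorate, sort, undecorate pipeline can be wrapped in a function and tried on that input. This is only a sketch: natural_sort is a hypothetical name, and it sorts on the first run of digits in each line.

```shell
# natural_sort (hypothetical helper): reads lines on stdin and sorts them
# by the first run of digits found in each line.
natural_sort() {
    tab=$(printf '\t')
    # Decorate: copy the first number to the front, followed by a tab.
    sed "s/^\([^0-9]*\)\([0-9][0-9]*\)/\2${tab}\1\2/" |
    # Sort numerically on the decorated prefix.
    sort -n |
    # Undecorate: drop the added first field.
    cut -f2-
}

printf '%s\n' abc1 abc10 abc2 | natural_sort
```

Lines without any digits pass through unchanged, because cut prints lines that contain no delimiter as they are.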

If available, simply use sort -V. This is a sort for version numbers, but it works well as a "natural sort" option.

$ ff=$( echo some.string_{100,101,102,23,24,25}_with_numbers.in-it.txt )

Without sort:

$ for f in $ff ; do echo $f ; done
some.string_100_with_numbers.in-it.txt
some.string_101_with_numbers.in-it.txt
some.string_102_with_numbers.in-it.txt
some.string_23_with_numbers.in-it.txt
some.string_24_with_numbers.in-it.txt
some.string_25_with_numbers.in-it.txt

With sort -V:

$ for f in $ff ; do echo $f ; done | sort -V
some.string_23_with_numbers.in-it.txt
some.string_24_with_numbers.in-it.txt
some.string_25_with_numbers.in-it.txt
some.string_100_with_numbers.in-it.txt
some.string_101_with_numbers.in-it.txt
some.string_102_with_numbers.in-it.txt

Not a direct answer, but you can rename the files so that the number part (zero-padded as needed) becomes the prefix.

Ex: rename 1.txt, 23.txt, 2.txt to 01.txt, 23.txt, 02.txt

Now the default ls output is in numerical order: 01.txt 02.txt 23.txt
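
A rough sketch of the padding step in plain shell follows. Note that it pads the first number where it stands instead of moving it to the front, which gives the same ordering when all names share a prefix. The function name pad_number and the default width of 3 are assumptions made for illustration, and the loop only prints the proposed names without renaming anything:

```shell
# pad_number (hypothetical helper): print NAME with its first run of digits
# zero-padded to WIDTH digits (default 3). Names without digits are unchanged.
pad_number() {
    name=$1
    width=${2:-3}
    # Text before the first digit, and the first run of digits itself.
    prefix=$(printf '%s\n' "$name" | sed 's/^\([^0-9]*\).*/\1/')
    number=$(printf '%s\n' "$name" | sed -n 's/^[^0-9]*\([0-9][0-9]*\).*/\1/p')
    if [ -z "$number" ]; then
        printf '%s\n' "$name"
        return
    fi
    # Everything after the number.
    rest=${name#"$prefix$number"}
    # Drop leading zeros so printf does not read the number as octal.
    number=$(printf '%s\n' "$number" | sed 's/^0*\(.\)/\1/')
    printf "%s%0${width}d%s\n" "$prefix" "$number" "$rest"
}

# Dry run: show the proposed new name for each file.
for f in 1.txt 23.txt 2.txt; do
    printf '%s -> %s\n' "$f" "$(pad_number "$f" 2)"
done
```

Once the printed names look right, the printf in the loop could be replaced with mv.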

A simple Python script might help you. You can even use a regex to do it in one shot, if you prefer.

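import glob
import os

# Assumes names that start with a number and an underscore,
# e.g. 64_smallest_word_window.py
files = glob.glob('[0-9]*_*.py')
print(files)
for f in files:
    parts = f.split('_')
    if not parts[0].isdigit():
        continue  # skip names whose first field is not purely numeric
    # Zero-pad the leading number to four digits: 64 -> 0064
    number = '%04d' % int(parts[0])
    new_name = '_'.join([number] + parts[1:])
    print(new_name)
    os.rename(f, new_name)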
