Sort list with numbers and letters
I am trying to sort a list which contains numbers and letters:
names = ["5aG", "6bG", "10cG", "J1", ...]
The output should look like this:
['5aG', '5bG', '5aR', '5bR', '6aG', '6bG', '6cG', '6aR', '6bR', '7aG', '7bG', '7aR', '8aG', '8bG', '8aR', '9aG', '9bG', '9aR','10aG', '10bG', '10cG', '10aR', 'J1', 'J2']
The first element of the string is always a number from 5 to 10, then there is a letter from a to c, and at the end there is another letter ("G" or "R").
Moreover there are the strings "J1" and "J2". They should always be the last ones ("J1" before "J2").
How can I achieve something like that? I thought about using a lambda function.
So far I have hard-coded it, but I think there should be a better solution.
This is my hard-coded version:
classes = ['5aG', '5bG', '5aR', '5bR', '6aG', '6bG', '6cG', '6aR', '6bR', '7aG', '7bG', '7aR', '8aG', '8bG', '8aR', '9aG', '9bG', '9aR','10aG', '10bG', '10cG', '10aR', 'J1', 'J2']
def s(v):
    """Get index of element in list"""
    try:
        return classes.index(v)
    except ValueError:
        return 500
l = ['5bG', '6aG', '6bG', '8aR', '9aG', '9bG', '9aR', '10cG', '10aR', 'J1', 'J2', '5aG', '']
w = sorted( l, key=s)
print(w)
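If the hard-coded order is kept, a possible refinement is to build the lookup once as a dictionary instead of calling `classes.index` for every element. The sketch below assumes the same `classes` list as above; the names `rank` and `rank_key` are hypothetical and not part of the original code.

```python
# Desired order, copied from the question.
classes = ['5aG', '5bG', '5aR', '5bR', '6aG', '6bG', '6cG', '6aR', '6bR',
           '7aG', '7bG', '7aR', '8aG', '8bG', '8aR', '9aG', '9bG', '9aR',
           '10aG', '10bG', '10cG', '10aR', 'J1', 'J2']

# Map each known name to its position; built once, so each lookup is a dict access.
rank = {name: i for i, name in enumerate(classes)}

def rank_key(v):
    """Return the position of v in classes; unknown values sort last."""
    return rank.get(v, len(classes))

l = ['5bG', '6aG', '6bG', '8aR', '9aG', '9bG', '9aR', '10cG', '10aR', 'J1', 'J2', '5aG', '']
w = sorted(l, key=rank_key)
print(w)
```

Unknown values such as the empty string get the rank `len(classes)`, so they end up after every known name without relying on a magic number like 500.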
You can use re to extract the front integer, then rely on tuple comparison.
import re
def key(s):
    num, letters = re.match(r'(\d*)(.*)', s).groups()
    return float(num or 'inf'), letters
sorted_names = sorted(names, key=key)
Note how you can rely on float('inf') to have your tokens without prefix digits pushed to the end.
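To make the tuple comparison concrete, here is a small sketch that repeats the `key` function from this answer and inspects the keys it produces for a few sample strings. The variable names `k_small`, `k_large` and `k_j` are only illustrative.

```python
import re

def key(s):
    # Split into an optional leading run of digits and the remaining characters.
    num, letters = re.match(r'(\d*)(.*)', s).groups()
    # Strings without leading digits get infinity, so they compare last.
    return float(num or 'inf'), letters

k_small = key('9aR')   # numeric prefix 9
k_large = key('10cG')  # numeric prefix 10
k_j = key('J1')        # no numeric prefix

print(k_small, k_large, k_j)

# Tuples compare element by element, so the numeric prefix decides first.
print(k_small < k_large < k_j)
```

Because the first tuple element is a float rather than a string, '9aR' is placed before '10cG', which plain string comparison would not do.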
You can try this:
After scrambling your desired output:
import re
s = ['5aR', '7aR', '10aR', '10cG', '9bG', '8aR', '8bG', '6bR', '5aG', '9aG', 'J1', '6aR', '6aG', '5bR', '7aG', '7bG', '9aR', '5bG', 'J2', '6bG', '10bG', '8aG', '10aG', '6cG']
c, d, *h = sorted(s, key=lambda x:[False if not x[0].isdigit() else int(re.findall(r'^\d+', x)[0]), x[-1], x[-2]])
sorted_result = [*h, c, d]
Output:
['5aG', '5bG', '5aR', '5bR', '6aG', '6bG', '6cG', '6aR', '6bR', '7aG', '7bG', '7aR', '8aG', '8bG', '8aR', '9aG', '9bG', '9aR', '10aG', '10bG', '10cG', '10aR', 'J1', 'J2']
Here is one way.
lst = ['7aR', '9aG', '7bG', '10cG', '5bG', '6aG', '6bG', '10bG', 'J2', '5aR', '10aG', '9bG', '6aR', '7aG', '10aR', '9aR', '8aR', 'J1', '5bR', '6bR', '5aG', '8bG', '6cG', '8aG']
sorted([i for i in lst if i[0]!='J'], key=lambda x: [int(x[:-2]), x[-1], x[-2]]) + \
sorted(i for i in lst if i[0]=='J')
# ['5aG', '5bG', '5aR', '5bR', '6aG', '6bG', '6cG', '6aR', '6bR', '7aG', '7bG', '7aR', '8aG', '8bG', '8aR', '9aG', '9bG', '9aR', '10aG', '10bG', '10cG', '10aR', 'J1', 'J2']
The easiest way to do this is to use Python's built-in sorting functions. By providing a suitable function as the key argument you can sort things in just about any order you choose.
Conceptually, when you provide a key function the sort pairs each value from the list with a sort key, which is the result of applying the key function to that value. It then sorts by those keys and returns the values in the resulting order. This is known as decorate-sort-undecorate.
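The decorate-sort-undecorate idea can be sketched by hand as follows. This is only an illustration of the pattern, not how `sorted` is actually implemented; the function name `sort_with_key` and the sample list `words` are hypothetical.

```python
def sort_with_key(values, key):
    # Decorate: pair each value with its key. The index breaks ties, so the
    # values themselves are never compared and equal keys keep input order.
    decorated = [(key(v), i, v) for i, v in enumerate(values)]
    # Sort: ordinary tuple comparison does the work.
    decorated.sort()
    # Undecorate: keep only the original values.
    return [v for _, _, v in decorated]

words = ['10cG', '5aG', '9bG']
result = sort_with_key(words, key=lambda s: int(s[:-2]))
print(result)
```

For this input the result is the same as `sorted(words, key=lambda s: int(s[:-2]))`.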
Most of your strings are an integer followed by two letters. The remainder, which you wish to appear last, are either "J1" or "J2". The following should be a suitable key function. I take the precaution of applying the int function to the numbers, to ensure that they sort numerically rather than lexicographically (because '2' > '10').
def key_func(s):
    # Ensure J-strings are at the end
    if s.startswith('J'):
        return (1000000, 'J', int(s[1:]))
    else:
        # The rest, split into digits and two characters
        return (int(s[:-2]), s[-2], s[-1])
When tested with a randomised copy of your data the result of
data = ['8aG', '5aR', '6aG', '10aG', '6cG', '8bG', '9aG',
'5aG', '6bG', '7aR', 'J1', '10cG', '10bG', '10aR',
'6bR', 'J2', '6aR', '8aR', '7aG', '9aR', '5bR',
'9bG', '7bG', '5bG']
print(sorted(data, key=key_func))
appears to be correct (line breaks inserted for readability):
['5aG', '5aR', '5bG', '5bR', '6aG', '6aR', '6bG', '6bR',
'6cG', '7aG', '7aR', '7bG', '8aG', '8aR', '8bG', '9aG',
'9aR', '9bG', '10aG', '10aR', '10bG', '10cG', 'J1', 'J2']
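Note that the output above orders each number by the middle letter first (5aG, 5aR, 5bG, ...), whereas the order listed in the question puts all "G" entries for a number before the "R" entries (5aG, 5bG, 5aR, ...). If that exact order is required, one possible variation is to swap the last two tuple components. The name `key_func_gr` is hypothetical, and using float('inf') for the J-strings is a choice made for this sketch.

```python
def key_func_gr(s):
    # J-strings go last, ordered by their number.
    if s.startswith('J'):
        return (float('inf'), '', int(s[1:]))
    # Number first, then the final letter (G before R), then the middle letter.
    return (int(s[:-2]), s[-1], s[-2])

data = ['8aG', '5aR', '6aG', '10aG', '6cG', '8bG', '9aG',
        '5aG', '6bG', '7aR', 'J1', '10cG', '10bG', '10aR',
        '6bR', 'J2', '6aR', '8aR', '7aG', '9aR', '5bR',
        '9bG', '7bG', '5bG']
ordered = sorted(data, key=key_func_gr)
print(ordered)
```

The first tuple element always differs between J-strings and the rest, so the mixed types in the later positions are never compared with each other.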
With custom compound_sort() function:
import re
lst = ['9bG', '9aR', 'J2', '7bG', '7aG', '6bR', 'J1', '6cG', '6aG', '6bG', '5bG', '5aG', '8bG', '5bR', '8aR', '5aR', '10aR', '6aR', '10bG', '10aG', '9aG', '10cG', '7aR', '8aG']
pat = re.compile(r'(\d+)(.*)|(J)(\d+)')
def compound_sort(t):
    t = tuple(filter(None, t))  # filter empty(None) matches
    return (int(t[0]),) + t[1:] if t[0] != 'J' else (float('inf'), t[1])
result = sorted(lst, key=lambda x: compound_sort(pat.search(x).groups()))
print(result)
The output:
['5aG', '5aR', '5bG', '5bR', '6aG', '6aR', '6bG', '6bR', '6cG', '7aG', '7aR', '7bG', '8aG', '8aR', '8bG', '9aG', '9aR', '9bG', '10aG', '10aR', '10bG', '10cG', 'J1', 'J2']