How to sort lists that contain letters and numbers?

I have tried lots of different ways to sort the list, but it never ends up in the right order.

list = ['american dad S1-EP1', 'american dad S1-EP10', 'american dad S1-EP11', 'american dad S1-EP12', 'american dad S1-EP13', 'american dad S1-EP14', 'american dad S1-EP15', 'american dad S1-EP16', 'american dad S1-EP17', 'american dad S1-EP18', 'american dad S1-EP19', 'american dad S1-EP2', 'american dad S1-EP20', 'american dad S1-EP21', 'american dad S1-EP22', 'american dad S1-EP23', 'american dad S1-EP3', 'american dad S1-EP4', 'american dad S1-EP5', 'american dad S1-EP6', 'american dad S1-EP7', 'american dad S1-EP8', 'american dad S1-EP9']

I want them all to be in order, e.g. ep1 ep2 ep3 ep4 ep5.

I suggest using the re module to extract the name, season and episode. The key_function below sorts the list by name, then season, then episode:

import re

pat = re.compile(r"(.*) S(\d+)-EP(\d+)")


def key_function(value):
    name, season, episode = pat.search(value).groups()
    return name, int(season), int(episode)


# lst is the list of episode names from the question
print(sorted(lst, key=key_function))

Prints:

[
    "american dad S1-EP1",
    "american dad S1-EP2",
    "american dad S1-EP3",
    "american dad S1-EP4",
    "american dad S1-EP5",
    "american dad S1-EP6",
    "american dad S1-EP7",
    "american dad S1-EP8",
    "american dad S1-EP9",
    "american dad S1-EP10",
    "american dad S1-EP11",
    "american dad S1-EP12",
    "american dad S1-EP13",
    "american dad S1-EP14",
    "american dad S1-EP15",
    "american dad S1-EP16",
    "american dad S1-EP17",
    "american dad S1-EP18",
    "american dad S1-EP19",
    "american dad S1-EP20",
    "american dad S1-EP21",
    "american dad S1-EP22",
    "american dad S1-EP23",
]
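
If some entries might not follow the "S<season>-EP<episode>" pattern, pat.search returns None and the key function above raises AttributeError. Here is a minimal sketch of one way to handle that. It assumes (my assumption, not part of the answer) that names without a tag should sort after all the others; safe_key and the sample list are hypothetical:

```python
import re

pat = re.compile(r"(.*) S(\d+)-EP(\d+)")


def safe_key(value):
    # Hypothetical variant of key_function that tolerates non-matching names.
    match = pat.search(value)
    if match is None:
        # Assumption: names without a season/episode tag go last, alphabetically.
        return (1, value, 0, 0)
    name, season, episode = match.groups()
    # The leading 0 keeps matched entries ahead of unmatched ones.
    return (0, name, int(season), int(episode))


sample = ["american dad S1-EP10", "extras", "american dad S1-EP2"]
print(sorted(sample, key=safe_key))
```

Every key is a tuple of the same shape (int, str, int, int), so Python never has to compare a string with an integer.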

I found an answer by using:

list.sort(key=lambda x: int("".join([i for i in x if i.isdigit()])))
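
One thing to keep in mind with this key: it joins every digit in the string into a single number, so "S1-EP10" becomes 110. That orders the question's single-season list correctly, but it may not do what you expect once several seasons are mixed. A small sketch to illustrate; digits_key and the two-season sample are hypothetical, not from the question:

```python
def digits_key(name):
    # Same idea as the lambda above: join every digit in the string into one int.
    return int("".join(ch for ch in name if ch.isdigit()))


# Within one season the combined numbers still compare in episode order.
one_season = ["american dad S1-EP10", "american dad S1-EP2"]
print(sorted(one_season, key=digits_key))

# Hypothetical multi-season data: S2-EP1 becomes 21, S1-EP10 becomes 110,
# so the season 2 episode is placed before the season 1 episode.
two_seasons = ["american dad S1-EP10", "american dad S2-EP1"]
print(sorted(two_seasons, key=digits_key))
```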

  1. Create a regular expression pattern with two capturing groups: one for the season number, one for the episode number.
  2. Define a custom key for the sorting function, which returns a tuple of integers. The episodes will be sorted in ascending order according to these integers.

Code:

import re

episodes = [
    'american dad S1-EP1',
    'american dad S1-EP10',
    'american dad S1-EP11',
    'american dad S1-EP12',
    'american dad S1-EP13',
    'american dad S1-EP14',
    'american dad S1-EP15',
    'american dad S1-EP16',
    'american dad S1-EP17',
    'american dad S1-EP18',
    'american dad S1-EP19',
    'american dad S1-EP2',
    'american dad S1-EP20',
    'american dad S1-EP21',
    'american dad S1-EP22',
    'american dad S1-EP23',
    'american dad S1-EP3',
    'american dad S1-EP4',
    'american dad S1-EP5',
    'american dad S1-EP6',
    'american dad S1-EP7',
    'american dad S1-EP8',
    'american dad S1-EP9'
]

pattern = "S(\\d+)-EP(\\d+)"

def key(episode):
    regex_match = re.search(pattern, episode)
    return tuple(map(int, regex_match.groups()))

print(sorted(episodes, key=key))

Output:

['american dad S1-EP1', 'american dad S1-EP2', 'american dad S1-EP3', 'american dad S1-EP4', 'american dad S1-EP5', 'american dad S1-EP6', 'american dad S1-EP7', 'american dad S1-EP8', 'american dad S1-EP9', 'american dad S1-EP10', 'american dad S1-EP11', 'american dad S1-EP12', 'american dad S1-EP13', 'american dad S1-EP14', 'american dad S1-EP15', 'american dad S1-EP16', 'american dad S1-EP17', 'american dad S1-EP18', 'american dad S1-EP19', 'american dad S1-EP20', 'american dad S1-EP21', 'american dad S1-EP22', 'american dad S1-EP23']

Try sorting with a key function:

list1 = ['american dad S1-EP1', 'american dad S1-EP10', 'american dad S1-EP11', 'american dad S1-EP12', 'american dad S1-EP13', 'american dad S1-EP14', 'american dad S1-EP15', 'american dad S1-EP16', 'american dad S1-EP17', 'american dad S1-EP18', 'american dad S1-EP19',
        'american dad S1-EP2', 'american dad S1-EP20', 'american dad S1-EP21', 'american dad S1-EP22', 'american dad S1-EP23', 'american dad S1-EP3', 'american dad S1-EP4', 'american dad S1-EP5', 'american dad S1-EP6', 'american dad S1-EP7', 'american dad S1-EP8', 'american dad S1-EP9']

def get_last_digits(s):
    # Use the last "EP" so a capital "P" in the title cannot interfere.
    last_digits = s[s.rindex("EP") + 2:]
    return int(last_digits)

list1.sort(key=get_last_digits)

Note: This only works if all episodes are from the same season.
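
If several seasons can appear in the same list, the same string-slicing idea can return a (season, episode) pair instead. A rough sketch, assuming every name ends with a tag of the form "S<season>-EP<episode>" preceded by a space; season_episode_key and the sample list are hypothetical:

```python
def season_episode_key(s):
    # Take the trailing tag, e.g. "S1-EP10", which follows the last space.
    tag = s.rsplit(" ", 1)[-1]
    season, episode = tag.split("-EP")
    # Drop the leading "S" and compare both parts as integers.
    return int(season[1:]), int(episode)


# Hypothetical sample spanning two seasons.
episodes = ["american dad S2-EP1", "american dad S1-EP10", "american dad S1-EP2"]
episodes.sort(key=season_episode_key)
print(episodes)
```

Tuples compare element by element, so the season decides first and the episode only breaks ties within a season.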

The big question here is whether you need to sort decimals. Assuming you only care about integers (e.g. that 12.6 should come before 12.56), you can convert each string into a list of (number, text) pairs, one pair per run of digits or non-digits, and then sort those lists:

import re

RE_NUM = re.compile(r'(\d+)|(\D+)')

def sort_mixed(strings):
    # sort list of strings with integers embedded in them
    split_strings = []
    for string in strings:
        split_string = [(int(i or 0), i or s) for i, s in RE_NUM.findall(string)]
        split_strings.append(split_string)
    return [''.join(s for _, s in v) for v in sorted(split_strings)]

# example usage
sort_mixed(['15.51', '12.9', '15.6.6', '15.6'])
# ['12.9', '15.6', '15.6.6', '15.51']

Note: Unlike other answers in this thread, the above works for any combination of integers and text, including strings with no integers, strings with no text, and strings with more than one integer.
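
The same splitting can also be used directly as a sort key, which avoids rebuilding the strings afterwards. This is just a sketch of that variation; natural_key and the sample list are hypothetical names, not part of the answer above:

```python
import re

RE_NUM = re.compile(r'(\d+)|(\D+)')


def natural_key(string):
    # Each chunk becomes (number, text): digit runs keep their integer value,
    # non-digit runs use 0 so every tuple has the same (int, str) shape.
    return [(int(i or 0), i or s) for i, s in RE_NUM.findall(string)]


# Hypothetical sample spanning two seasons.
episodes = ['american dad S2-EP1', 'american dad S1-EP10', 'american dad S1-EP2']
print(sorted(episodes, key=natural_key))
```

Because the key is computed from each original string, sorted returns the original strings themselves, and no join step is needed.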

You can customize the sort key with a lambda. (BTW, avoid naming a variable list in Python: it is not a reserved word, but it is the name of a built-in type, and reusing it shadows the built-in.)

For more details about lambda, check the link.

Example:

l = ['american dad S1-EP1', 'american dad S1-EP10', 'american dad S1-EP11', 'american dad S1-EP12', 'american dad S1-EP13', 'american dad S1-EP14', 'american dad S1-EP15', 'american dad S1-EP16', 'american dad S1-EP17', 'american dad S1-EP18', 'american dad S1-EP19', 'american dad S1-EP2', 'american dad S1-EP20', 'american dad S1-EP21', 'american dad S1-EP22', 'american dad S1-EP23', 'american dad S1-EP3', 'american dad S1-EP4', 'american dad S1-EP5', 'american dad S1-EP6', 'american dad S1-EP7', 'american dad S1-EP8', 'american dad S1-EP9']
sorted_l = sorted(l, key=lambda x: int(x.split("-EP")[1]))
print(sorted_l)

Alternatively, Python can sort one list based on the values in another list (check the link). You can create a second list that contains only the episode numbers.

Example:

l = ['american dad S1-EP1', 'american dad S1-EP10', 'american dad S1-EP11', 'american dad S1-EP12', 'american dad S1-EP13', 'american dad S1-EP14', 'american dad S1-EP15', 'american dad S1-EP16', 'american dad S1-EP17', 'american dad S1-EP18', 'american dad S1-EP19', 'american dad S1-EP2', 'american dad S1-EP20', 'american dad S1-EP21', 'american dad S1-EP22', 'american dad S1-EP23', 'american dad S1-EP3', 'american dad S1-EP4', 'american dad S1-EP5', 'american dad S1-EP6', 'american dad S1-EP7', 'american dad S1-EP8', 'american dad S1-EP9']
ep_list = [int(x.split("-EP")[1]) for x in l]
sorted_l = [x for _, x in sorted(zip(ep_list, l))]
print(sorted_l)

Output:

['american dad S1-EP1', 'american dad S1-EP2', 'american dad S1-EP3', 'american dad S1-EP4', 'american dad S1-EP5', 'american dad S1-EP6', 'american dad S1-EP7', 'american dad S1-EP8', 'american dad S1-EP9', 'american dad S1-EP10', 'american dad S1-EP11', 'american dad S1-EP12', 'american dad S1-EP13', 'american dad S1-EP14', 'american dad S1-EP15', 'american dad S1-EP16', 'american dad S1-EP17', 'american dad S1-EP18', 'american dad S1-EP19', 'american dad S1-EP20', 'american dad S1-EP21', 'american dad S1-EP22', 'american dad S1-EP23']
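
A detail worth knowing about the zip version: sorted(zip(ep_list, l)) compares whole (number, name) tuples, so when two entries share an episode number the names themselves decide the order. If you would rather keep the original order for ties, you can sort on the number alone. A small sketch with made-up data (the names list is hypothetical):

```python
# Hypothetical data where two entries share episode number 1.
names = ["b S1-EP1", "a S2-EP1", "a S1-EP2"]
ep_list = [int(x.split("-EP")[1]) for x in names]

# Plain tuple sort: ties on the episode number are broken by the string itself.
by_tuple = [x for _, x in sorted(zip(ep_list, names))]

# Sorting on the number only: ties keep their original order (sorted is stable).
by_number = [x for _, x in sorted(zip(ep_list, names), key=lambda pair: pair[0])]

print(by_tuple)
print(by_number)
```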
