Natural sorting of a list in Python3

I am trying to sort a list:

[
    '[fc] EDW Ratio (10 degrees)', 
    ' [fc] EDW Ratio (45 degrees)', 
    ' [fc] EDW Ratio (60 degrees)', 
    ' [fc] EDW Ratio (25 degrees)', 
    ' [fc] EDW Ratio (20 degrees)', 
    ' [fc] EDW Ratio (30 degrees)', 
    ' [fc] EDW Ratio (15 degrees)', 
    ' [fc] EDW output factor (60 degrees)', 
    ' [fc] Quality index'
]

using the first part of the accepted answer here:

But the list is ending up like this:

[
    ' [fc] EDW Ratio (15 degrees)', 
    ' [fc] EDW Ratio (20 degrees)', 
    ' [fc] EDW Ratio (25 degrees)', 
    ' [fc] EDW Ratio (30 degrees)', 
    ' [fc] EDW Ratio (45 degrees)', 
    ' [fc] EDW Ratio (60 degrees)', 
    ' [fc] EDW output factor (60 degrees)', 
    ' [fc] Quality index', 
    '[fc] EDW Ratio (10 degrees)'
]

whereas I want EDW Ratio (10 degrees) to end up at the start of the list after sorting (index position 0).

How can this be done?

My code includes the following:

import re

#
# Method to define natural sorting used to sort lists
#
def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    '''
    return [ atoi(c) for c in re.split(r'(\d+)', text) ]

    .
    .
    .


    tname_list = test_names.split(",") # this outputs the exact first (unsorted) list shown above

    tname_list.sort(key=natural_keys) # use human sorting defined above. This outputs the second list shown above.

Your code is correct, but your data looks inconsistent: every entry except the first has a leading space. A space sorts before "[", so all of those entries come "before" the one you expect to be first, which has no leading space.
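The effect is easy to reproduce in isolation. A minimal sketch, using two shortened strings that are made up for illustration:

```python
# A space (code point 32) sorts before '[' (code point 91), so any string
# that starts with a space sorts ahead of one that starts with '['.
print(ord(' '), ord('['))

# Hypothetical, shortened entries for illustration only.
entries = ['[fc] Ratio (10 degrees)', ' [fc] Ratio (45 degrees)']
result = sorted(entries)
print(result)
```

The entry with the leading space comes first even though 45 is larger than 10, because the comparison is already decided at the first character.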

If the data is meant to stay as it is, I suggest revising the code to ignore leading whitespace (see: How do I remove leading whitespace in Python?).
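One way to do that is to strip inside the key function, which leaves the stored strings untouched. A minimal sketch that reuses the question's atoi and natural_keys, with a shortened copy of the list as sample data:

```python
import re

def atoi(text):
    # Convert runs of digits to int so they compare numerically.
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # Strip leading whitespace before splitting into text and digit chunks.
    return [atoi(c) for c in re.split(r'(\d+)', text.lstrip())]

# Shortened sample taken from the question's list.
tname_list = [
    '[fc] EDW Ratio (10 degrees)',
    ' [fc] EDW Ratio (45 degrees)',
    ' [fc] EDW Ratio (15 degrees)',
    ' [fc] Quality index',
]
tname_list.sort(key=natural_keys)
print(tname_list)
```

If the stored values themselves should be clean, stripping each item when splitting is another option, for example `[t.strip() for t in test_names.split(",")]`.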

You could modify natural_keys to return only the numerical part of the string as an int. Note that the atoi() in the question is a user-defined helper that already calls int() on runs of digits; it does not return the ASCII code of a character, so the conversion itself does not need to change.

You're going to run into trouble if any of your strings contain more than one number, or put the numbers at the beginning or end of the string. That's because Python 3 can't order an int and a str against each other; the comparison raises TypeError. Your key function should return both as a tuple or list.

import math

def atoi(text):
    # Always return a (number, text) pair so every chunk has the same shape.
    return (int(text), '') if text.isdigit() else (math.nan, text)

math.nan is special: every ordering comparison that involves it evaluates to False, so it will never compare less than an actual number.
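A common alternative that avoids NaN altogether is to tag each chunk with a type rank, so an int is only ever compared with an int and a str with a str. This is a sketch of that idea, not the code from this answer; the names chunk_key and natural_key and the sample strings are made up for illustration.

```python
import re

def chunk_key(chunk):
    # Digit runs become (0, number, ''), everything else (1, 0, text).
    # The leading rank decides mixed comparisons, so int and str never meet.
    return (0, int(chunk), '') if chunk.isdigit() else (1, 0, chunk)

def natural_key(text):
    # Drop empty chunks, so a key may start with either a number or text.
    return [chunk_key(c) for c in re.split(r'(\d+)', text) if c]

items = ['10 apples', 'apples 2', '9 apples', 'apples 10']
result = sorted(items, key=natural_key)
print(result)
```

With this ranking, chunks that are numbers sort ahead of text chunks at the same position, which is a design choice rather than a requirement.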

I recommend using natsort (full disclosure, I am the author). Your data is also a bit messy; you need to strip the leading whitespace to normalize all the entries.

from natsort import natsorted
data = [
    '[fc] EDW Ratio (10 degrees)', 
    ' [fc] EDW Ratio (45 degrees)', 
    ' [fc] EDW Ratio (60 degrees)', 
    ' [fc] EDW Ratio (25 degrees)', 
    ' [fc] EDW Ratio (20 degrees)', 
    ' [fc] EDW Ratio (30 degrees)', 
    ' [fc] EDW Ratio (15 degrees)', 
    ' [fc] EDW output factor (60 degrees)', 
    ' [fc] Quality index'
]
data_sorted = natsorted(data, key=lambda x: x.lstrip())

Output:

[
 '[fc] EDW Ratio (10 degrees)',
 ' [fc] EDW Ratio (15 degrees)',
 ' [fc] EDW Ratio (20 degrees)',
 ' [fc] EDW Ratio (25 degrees)',
 ' [fc] EDW Ratio (30 degrees)',
 ' [fc] EDW Ratio (45 degrees)',
 ' [fc] EDW Ratio (60 degrees)',
 ' [fc] EDW output factor (60 degrees)',
 ' [fc] Quality index',
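]

import re

def get_numbers(texto):
    # Use the first run of digits as the sort key.
    # Entries without any digits sort last instead of raising IndexError.
    match = re.search(r'[0-9]+', texto)
    return int(match.group()) if match else float('inf')

def sort_list(l):
    # sorted() keeps entries that share a number; a dict keyed by the
    # number would silently drop all but one of them.
    return sorted(l, key=get_numbers)

sort_list(frames)  # frames: the list of strings to sort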

Note that this only considers the first series of digits: "peter123jjj111" will only take 123 into account.
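If every group of digits should count, one option is to use all of the matches as the key instead of only the first. A minimal sketch; all_numbers is a hypothetical helper name and the sample strings are invented:

```python
import re

def all_numbers(text):
    # Every run of digits, in order, as a tuple of ints; () if there are none.
    return tuple(int(n) for n in re.findall(r'[0-9]+', text))

names = ['peter123jjj111', 'peter123jjj20', 'peter9jjj500']
result = sorted(names, key=all_numbers)
print(result)
```

This key still ignores the non-digit text entirely, so it only suits lists where the numbers alone determine the order.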
