
How to sort a Python list of strings based on a substring

I am trying to sort a Python list using the sorted function, as in the code below. However, the sorting does not come out in the order I expect.

#sort using the number part of the string
mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt'] 
def func(elem):
    return elem.split('-')[1].split('.')[0]

sortlist = sorted(mylist,key=func)
for i in sortlist:
  print(i)

The output is:
XYZ-18.txt
XYZ-78.txt
XYZ-8.txt

I was expecting:
XYZ-8.txt
XYZ-18.txt
XYZ-78.txt

You should convert the numbers to integers:

#sort using the number part of the string
mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt'] 
def func(elem):
    return int(elem.split('-')[1].split('.')[0])

sortlist = sorted(mylist,key=func)
for i in sortlist:
  print(i)

What you see is ordering based on the ASCII values of the characters. Strings are compared character by character, so '18' sorts before '8' because '1' comes before '8'.
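
To make the difference concrete, here is a minimal sketch that compares the two orderings on just the number parts. The variable names are illustrative only and are not part of the question's code.

```python
# The number parts of the file names, still as strings.
keys = ['78', '8', '18']

# Default string comparison goes character by character.
as_text = sorted(keys)

# Converting each key to int compares the numeric values instead.
as_numbers = sorted(keys, key=int)

print('18' < '8')   # True, because '1' sorts before '8'
print(as_text)      # ['18', '78', '8']
print(as_numbers)   # ['8', '18', '78']
```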

Wrap the extracted value in int.

Example:

mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt'] 
print(sorted(mylist, key=lambda x: int(x.split("-")[-1].split(".")[0])))

Output:

['XYZ-8.txt', 'XYZ-18.txt', 'XYZ-78.txt']

With str methods:

mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt']
result = sorted(mylist, key=lambda x: int(x[x.index('-')+1:].replace('.txt', '')))

print(result)

The output:

['XYZ-8.txt', 'XYZ-18.txt', 'XYZ-78.txt']

Use this code to sort the list of strings numerically (which is what is needed) instead of lexicographically (which is what the given code does).

#sort using the number part of the string
mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt'] 
def func(elem):
    # Slice between the '-' and the 4-character '.txt' suffix, then convert to int
    return int(elem[elem.index('-')+1:len(elem)-4])
sortlist = sorted(mylist,key=func)
for i in sortlist: 
    print(i) 
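
The slicing-based keys in these answers depend on the exact '.txt' suffix. If the extensions vary, one possible alternative is to pull the digits out with a regular expression. This is only a sketch: number_key is a hypothetical helper, the mixed file list is made up for illustration, and sorting names without a number first is an assumption you may want to change.

```python
import re

def number_key(name):
    # Hypothetical helper: capture the digits between a '-' and the final extension.
    match = re.search(r'-([0-9]+)\.[^.]*$', name)
    # Assumption: names without such a number sort before all numbered names.
    return int(match.group(1)) if match else -1

# Illustrative data with mixed extensions (not from the question).
files = ['XYZ-78.txt', 'XYZ-8.csv', 'XYZ-18.log', 'README.md']
ordered = sorted(files, key=number_key)
print(ordered)
```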

There is a generic approach to this problem called human-readable sort, or by its more popular name alphanum sort, which sorts things in the order humans expect them to appear.

import re

mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt']

def tryint(s):
    # Convert digit chunks to int; leave everything else as a string.
    try:
        return int(s)
    except ValueError:
        return s

def alphanum_key(s):
    """ Turn a string into a list of string and number chunks.
        "z23a" -> ["z", 23, "a"]
    """
    return [ tryint(c) for c in re.split('([0-9]+)', s) ]

def sort_nicely(l):
    """ Sort the given list in place, in the way that humans expect.
    """
    l.sort(key=alphanum_key)

sort_nicely(mylist)
print(mylist)

Output:

['XYZ-8.txt', 'XYZ-18.txt', 'XYZ-78.txt']

This works on any string; there is no need to split or slice characters to extract a sortable field.
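
As a rough illustration of that claim, the sketch below applies the same splitting idea to names with different prefixes and extensions. natural_key is a hypothetical name and the sample list is invented; the chunking relies on re.split with a capturing group, which puts the digit runs at the odd positions of the result.

```python
import re

def natural_key(text):
    # re.split with a capturing group alternates: text, digits, text, digits, ...
    parts = re.split(r'([0-9]+)', text)
    # Odd positions hold the digit runs; convert only those to int.
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]

# Illustrative names with different prefixes (not from the question).
names = ['report10.csv', 'report2.csv', 'XYZ-18.txt', 'XYZ-8.txt', 'XYZ-78.txt']
natural_sorted = sorted(names, key=natural_key)
print(natural_sorted)
```

Because every key starts with a text chunk (possibly empty) and alternates from there, integers are only ever compared with integers and strings with strings.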

A good read about alphanum: http://www.davekoelle.com/alphanum.html

Original source code: https://nedbatchelder.com/blog/200712/human_sorting.html
