
Sort list of strings numerically and filter duplicates?

Given a list of strings in the following format:

[
    "464782,-100,4,3,1,100,0,0",
    "465042,-166.666666666667,4,3,1,100,0,0",
    "465825,-250.000000000001,4,3,1,100,0,0",
    "466868,-166.666666666667,4,3,1,100,0,0",
    "467390,-200.000000000001,4,3,1,100,0,0",
    "469999,-100,4,3,1,100,0,0",
    "470260,-166.666666666667,4,3,1,100,0,0",
    "474173,-100,4,3,1,100,0,0",
    "474434,-166.666666666667,4,3,1,100,0,0",
    "481477,-100,4,3,1,100,0,1",
    "531564,259.011439671919,4,3,1,60,1,0",
    "24369,-333.333333333335,4,3,1,100,0,0",
    "21082,410.958904109589,4,3,1,60,1,0",
    "21082,-250,4,3,1,100,0,0",
    "22725,-142.857142857143,4,3,1,100,0,0",
    "23547,-166.666666666667,4,3,1,100,0,0",
    "24369,-333.333333333335,4,3,1,100,0,0",
    "27657,-200.000000000001,4,3,1,100,0,0",
    "29301,-142.857142857143,4,3,1,100,0,0",
    "30123,-166.666666666667,4,3,1,100,0,0",
    "30945,-250,4,3,1,100,0,0",
    "32588,-166.666666666667,4,3,1,100,0,0",
    "34232,-250,4,3,1,100,0,0",
    "35876,-142.857142857143,4,3,1,100,0,0",
    "36698,-166.666666666667,4,3,1,100,0,0",
    "37520,-250,4,3,1,100,0,0",
    "42451,-142.857142857143,4,3,1,100,0,0",
    "43273,-166.666666666667,4,3,1,100,0,0",
]

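How can I sort the list by the first number in each line using Python? And then, once sorted, remove all duplicates, if any?

The sort key is the number before the first comma in each line, which is always an integer.

I tried list.sort(), but it sorts the items in lexical order rather than numeric order.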
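
To make the lexical-versus-numeric difference concrete, here is a minimal sketch. The three rows are hypothetical (they follow the question's format but are not from the list above), and leading_int is a helper name introduced only for this illustration.

```python
# Hypothetical rows in the same comma-separated format as the question.
rows = [
    "9,-100,4,3,1,100,0,0",
    "100,-250,4,3,1,100,0,0",
    "10,-166.666666666667,4,3,1,100,0,0",
]


def leading_int(row):
    """Return the integer before the first comma."""
    return int(row.split(",", 1)[0])


# A bare sorted() compares the strings character by character.
lexical = sorted(rows)

# With a key function, the leading integers are compared instead.
numeric = sorted(rows, key=leading_int)

print([leading_int(r) for r in lexical])  # [10, 100, 9]
print([leading_int(r) for r in numeric])  # [9, 10, 100]
```

The same key function can be passed to list.sort(key=...) to sort in place.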

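You can use a dictionary for this. The key is the number before the first comma and the value is the whole string. Duplicates are eliminated, but only the last occurrence of a string with a given number is stored.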

l = ['464782,-100,4,3,1,100,0,0',
'465042,-166.666666666667,4,3,1,100,0,0',
'465825,-250.000000000001,4,3,1,100,0,0',
'466868,-166.666666666667,4,3,1,100,0,0',
'467390,-200.000000000001,4,3,1,100,0,0',
...]

d = {int(s.split(',')[0]) : s for s in l}
result = [d[key] for key in sorted(d.keys())]
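
If the first occurrence should win instead of the last, one possible variant (a sketch, not part of the answer above) uses dict.setdefault, which stores a value only when the key is new. The function name dedupe_keep_first is made up for this example. Note that, like the dictionary approach above, it treats any two rows with the same leading number as duplicates, even when the rest of the row differs.

```python
# Sample rows taken from the question; 21082 appears with two different payloads.
rows = [
    "21082,410.958904109589,4,3,1,60,1,0",
    "24369,-333.333333333335,4,3,1,100,0,0",
    "21082,-250,4,3,1,100,0,0",
    "24369,-333.333333333335,4,3,1,100,0,0",
]


def dedupe_keep_first(rows):
    """Keep the first row seen for each leading integer, sorted numerically."""
    first_seen = {}
    for row in rows:
        key = int(row.split(",", 1)[0])
        # setdefault only stores the row if this key has not been seen yet.
        first_seen.setdefault(key, row)
    return [first_seen[k] for k in sorted(first_seen)]


result = dedupe_keep_first(rows)
print(result)
```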

I would try one of these two approaches:

def sort_list(lis):
    nums = [int(num) if num.isdigit() else float(num) for num in lis]

    nums = list(set(nums))
    nums.sort()

    return [str(i) for i in nums]  # I assumed you wanted them to be strings.

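The first raises an exception if any item in lis is not a string representation of a number (float() raises ValueError for a non-numeric string). The second is a somewhat odd variant that converts the ints and the floats separately; it still calls float() on every non-digit string, so the input must consist of numeric strings there as well.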

def sort_list(lis):
    ints = [int(num) for num in lis if num.isdigit()]
    floats = [float(num) for num in lis if not num.isdigit()]

    nums = ints.copy()
    nums.extend(floats)
    nums = list(set(nums))
    nums.sort()

    return [str(i) for i in nums]  # I assumed you wanted them to be strings.

Hope this helps.
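
If the input may contain strings that are not numbers, one possible way to avoid the exception is to catch ValueError and skip those items. This is only a sketch under that assumption; parse_number and sort_unique_numbers are names invented here, not part of the answer above.

```python
def parse_number(text):
    """Parse text as an int if possible, otherwise as a float.

    Raises ValueError if text is not a numeric string.
    """
    try:
        return int(text)
    except ValueError:
        return float(text)


def sort_unique_numbers(items):
    """Return the distinct numeric strings in ascending order, skipping the rest."""
    numbers = set()
    for item in items:
        try:
            numbers.add(parse_number(item))
        except ValueError:
            continue  # not a number: skip it instead of failing
    return [str(n) for n in sorted(numbers)]


print(sort_unique_numbers(["10", "-2.5", "abc", "10", "3"]))
```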

You can try this.

First, we remove the duplicates from the list using set()

removed_duplicates_list = list(set(listr))

Then we convert the list of strings into a list of tuples

list_of_tuples = [tuple(i.split(",")) for i in removed_duplicates_list]

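Then we sort it using sort(), with a numeric key so that the leading number is not compared as a string

list_of_tuples.sort(key=lambda t: (int(t[0]), float(t[1])))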

The complete code example is as follows:

listr = [
    "464782,-100,4,3,1,100,0,0",
    "465042,-166.666666666667,4,3,1,100,0,0",
    "465825,-250.000000000001,4,3,1,100,0,0",
    "466868,-166.666666666667,4,3,1,100,0,0",
    "467390,-200.000000000001,4,3,1,100,0,0",
    "469999,-100,4,3,1,100,0,0",
    "470260,-166.666666666667,4,3,1,100,0,0",
    "474173,-100,4,3,1,100,0,0",
    "474434,-166.666666666667,4,3,1,100,0,0",
    "481477,-100,4,3,1,100,0,1",
    "531564,259.011439671919,4,3,1,60,1,0",
    "24369,-333.333333333335,4,3,1,100,0,0",
    "21082,410.958904109589,4,3,1,60,1,0",
    "21082,-250,4,3,1,100,0,0",
    "22725,-142.857142857143,4,3,1,100,0,0",
    "23547,-166.666666666667,4,3,1,100,0,0",
    "24369,-333.333333333335,4,3,1,100,0,0",
    "27657,-200.000000000001,4,3,1,100,0,0",
    "29301,-142.857142857143,4,3,1,100,0,0",
    "30123,-166.666666666667,4,3,1,100,0,0",
    "30945,-250,4,3,1,100,0,0",
    "32588,-166.666666666667,4,3,1,100,0,0",
    "34232,-250,4,3,1,100,0,0",
    "35876,-142.857142857143,4,3,1,100,0,0",
    "36698,-166.666666666667,4,3,1,100,0,0",
    "37520,-250,4,3,1,100,0,0",
    "42451,-142.857142857143,4,3,1,100,0,0",
    "43273,-166.666666666667,4,3,1,100,0,0",
]

removed_duplicates_list = list(set(listr))
list_of_tuples = [tuple(i.split(",")) for i in removed_duplicates_list]
list_of_tuples.sort(key=lambda t: (int(t[0]), float(t[1])))  # numeric key; a bare sort() compares the strings lexically
print(list_of_tuples) # the output is a list of tuples

Output:

    [('21082', '-250', '4', '3', '1', '100', '0', '0'),
    ('21082', '410.958904109589', '4', '3', '1', '60', '1', '0'),
    ('22725', '-142.857142857143', '4', '3', '1', '100', '0', '0'),
    ('23547', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
    ('24369', '-333.333333333335', '4', '3', '1', '100', '0', '0'),
    ('27657', '-200.000000000001', '4', '3', '1', '100', '0', '0'),
    ('29301', '-142.857142857143', '4', '3', '1', '100', '0', '0'),
    ('30123', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
    ('30945', '-250', '4', '3', '1', '100', '0', '0'),
    ('32588', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
    ('34232', '-250', '4', '3', '1', '100', '0', '0'),
    ('35876', '-142.857142857143', '4', '3', '1', '100', '0', '0'),
    ('36698', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
    ('37520', '-250', '4', '3', '1', '100', '0', '0'),
    ('42451', '-142.857142857143', '4', '3', '1', '100', '0', '0'),
    ('43273', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
    ('464782', '-100', '4', '3', '1', '100', '0', '0'),
    ('465042', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
    ('465825', '-250.000000000001', '4', '3', '1', '100', '0', '0'),
    ('466868', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
    ('467390', '-200.000000000001', '4', '3', '1', '100', '0', '0'),
    ('469999', '-100', '4', '3', '1', '100', '0', '0'),
    ('470260', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
    ('474173', '-100', '4', '3', '1', '100', '0', '0'),
    ('474434', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
    ('481477', '-100', '4', '3', '1', '100', '0', '1'),
    ('531564', '259.011439671919', '4', '3', '1', '60', '1', '0')]
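
Since the question asks for a list of strings, the tuples can be joined back with ",".join once they are sorted. A minimal sketch on a few rows from the question; the variable names are illustrative only.

```python
# A few rows from the question, including one exact duplicate.
rows = [
    "464782,-100,4,3,1,100,0,0",
    "21082,410.958904109589,4,3,1,60,1,0",
    "21082,-250,4,3,1,100,0,0",
    "464782,-100,4,3,1,100,0,0",
]

# A set of tuples removes rows that are identical in every field.
unique_tuples = {tuple(row.split(",")) for row in rows}

# Sort on the leading integer, then on the second field as a float.
ordered = sorted(unique_tuples, key=lambda t: (int(t[0]), float(t[1])))

# Join each tuple back into the original comma-separated string.
result = [",".join(t) for t in ordered]
print(result)
```

Unlike the dictionary approach, this keeps both 21082 rows, because only rows that match in every field count as duplicates.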

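I hope this helps. I put all the list elements in a separate file named lista.txt. You need to take the elements one by one (with a while loop or a for loop) and add each to a temporary list after checking whether it is already there; if it is, skip it. Then you can call .sort(). Note that on strings a bare .sort() compares lexically, so a numeric key on the leading integer is needed for a true numeric order.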

# Global variables
file = "lista.txt"
tempList = []

# Read the items from the file, skipping duplicates
def GetListFromFile(fileName):
    # Local variables
    showDoneMsg = True

    # Try to run this code
    try:
        # Open the file and try to read it
        with open(fileName, mode="r") as f:
            # Read the first line
            line = f.readline()
            # For every line in the file
            while line:
                # Strip trailing whitespace (\n, \r)
                item = line.rstrip()

                # Check that this item is not already in the list
                if item not in tempList:
                    # Append the item to the temporary list
                    tempList.append(item)
                # Report items that already exist
                else:
                    print("Dublicate >>", item)

                # Go to the next line
                line = f.readline()
        # No explicit f.close() is needed: the with block closes the file

    # Exceptions
    except FileNotFoundError:
        print("ERROR >> File do not exist!")
        showDoneMsg = False

    # Sort on the leading integer (a bare sort() compares the strings lexically);
    # the full line breaks ties. Assumes every line starts with an integer.
    tempList.sort(key=lambda s: (int(s.split(",")[0]), s))
    # Report completion only if the file existed
    if showDoneMsg:
        print("\n>>> DONE <<<\n")

# Print the items of the list
def ShowListItems(thisList):
    if len(thisList) == 0:
        print("Temporary list is empty...")
    else:
        print("This is new items list:")
        for i in thisList:
            print(i)

# Run the function
GetListFromFile(file)
# Check that the items were sorted
ShowListItems(tempList)

Output:

========================= RESTART: D:\Python\StackOverflow\help.py =========================
Dublicate >> 43273,-166.666666666667,4,3,1,100,0,0

>>> DONE <<<

21082,-250,4,3,1,100,0,0
21082,410.958904109589,4,3,1,60,1,0
22725,-142.857142857143,4,3,1,100,0,0
...
474434,-166.666666666667,4,3,1,100,0,0
481477,-100,4,3,1,100,0,1
531564,259.011439671919,4,3,1,60,1,0
>>> 
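
For comparison, here is a more compact sketch of the same read, de-duplicate and sort idea. It is not the code above: read_unique_sorted is a hypothetical helper, it uses a set for the membership test, and it assumes every non-empty line starts with an integer. The demo writes a small temporary lista.txt so the example runs on its own.

```python
import os
import tempfile


def read_unique_sorted(path):
    """Read lines from path, drop blanks and exact duplicates, sort by leading integer."""
    seen = set()
    rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = line.strip()
            if not row or row in seen:
                continue  # skip blank lines and rows already collected
            seen.add(row)
            rows.append(row)
    # Numeric sort on the integer before the first comma.
    rows.sort(key=lambda r: int(r.split(",", 1)[0]))
    return rows


# Demo on a temporary file holding a few rows from the question.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "lista.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write(
            "43273,-166.666666666667,4,3,1,100,0,0\n"
            "464782,-100,4,3,1,100,0,0\n"
            "\n"
            "43273,-166.666666666667,4,3,1,100,0,0\n"
            "21082,-250,4,3,1,100,0,0\n"
        )
    demo_rows = read_unique_sorted(demo_path)

print(demo_rows)
```

The set makes each duplicate check a constant-time lookup, whereas "item not in tempList" scans the whole list on every line.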

