Sort a list of tuples by descending percentage with hierarchical order

I'm trying to sort an ingredient statement with multiple levels in order of dominance (descending by percentage).

I'm using Python and I have a list of tuples; each tuple has the following fields: (ingredient, percentage, childID, parentID).

It comes from data that looks roughly like this; the data can be entered in any order. The columns below are ingredient/subingredient, percentage, childID, parentID.

#Ing1   30%             1   0
#---Sub1    30%         2   1
#---Sub2    60%         3   1
#------Sub3     15%     4   3
#------Sub4     85%     5   3
#---Sub5    10%         6   1
#Ing2   10%             7   0
#Ing3   60%             5   0

My existing code outputs this as a list like this (in the order it was entered):

list = [('Ing1',30,1,0),('Sub1',30,2,1),('Sub2',60,3,1),('Sub3',15,4,3),('Sub4',85,5,3),('Sub5',10,6,1),('Ing2',10,7,0),('Ing3',60,5,0)]

What I need to do is sort this list in descending order by percentage while keeping the hierarchy intact from the lower levels up: the level 3 ingredients (Sub3, Sub4) first, then the next level up, then the top level.
The sublevels need to stay with their parent.

So, for the example above, I need the output to be in this order:

> #Ing3 60%             5   0
> #Ing1 30%             1   0
> #---Sub2  60%         3   1
> #------Sub4   85%     5   3
> #------Sub3   15%     4   3
> #---Sub1  30%         2   1
> #---Sub5  10%         6   1
> #Ing2 10%             7   0

So the list should look like this:

list = [('Ing3',60,5,0),('Ing1',30,1,0),('Sub2',60,3,1),('Sub4',85,5,3),('Sub3',15,4,3),('Sub1',30,2,1),('Sub5',10,6,1),('Ing2',10,7,0)]

What's the most elegant way to do this in Python? One more caveat: I'm limited in which modules I can import. If it's not an included module, I probably don't have access to it because of my environment.

You could use a generator like this:

lst = [('Ing1',30,1,0),
       ('Sub1',30,2,1),
       ('Sub2',60,3,1),
       ('Sub3',15,4,3),
       ('Sub4',85,5,3),
       ('Sub5',10,6,1),
       ('Ing2',10,7,0),
       ('Ing3',60,5,0)]

def sort_hierarchical(lst, parent=0):
    # sort the current layer (excluding all other elements) by the second element
    res = sorted([i for i in lst if i[3] == parent], key=lambda x: x[1], reverse=True)
    for item in res:
        yield item
        # recurse for all children of this item
        for subitem in sort_hierarchical(lst, parent=item[2]):
            yield subitem

>>> list(sort_hierarchical(lst))
[('Ing3', 60, 5, 0),
 ('Ing1', 30, 1, 0),
 ('Sub2', 60, 3, 1),
 ('Sub4', 85, 5, 3),
 ('Sub3', 15, 4, 3),
 ('Sub1', 30, 2, 1),
 ('Sub5', 10, 6, 1),
 ('Ing2', 10, 7, 0)]
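
To print the result in the indented form used in the question, the same recursion can carry a depth counter. This is only a sketch: the name sort_with_depth and the exact output format are assumptions for illustration, not part of the original code.

```python
lst = [('Ing1', 30, 1, 0), ('Sub1', 30, 2, 1), ('Sub2', 60, 3, 1),
       ('Sub3', 15, 4, 3), ('Sub4', 85, 5, 3), ('Sub5', 10, 6, 1),
       ('Ing2', 10, 7, 0), ('Ing3', 60, 5, 0)]

def sort_with_depth(rows, parent=0, depth=0):
    # hypothetical variant of sort_hierarchical that also yields the nesting depth
    level = sorted((r for r in rows if r[3] == parent),
                   key=lambda r: r[1], reverse=True)
    for row in level:
        yield depth, row
        # children of this row sit one level deeper
        for pair in sort_with_depth(rows, parent=row[2], depth=depth + 1):
            yield pair

# one '---' per nesting level, mirroring the layout in the question
lines = ['#' + '---' * depth + '{0} {1}%'.format(row[0], row[1])
         for depth, row in sort_with_depth(lst)]
print('\n'.join(lines))
```

The loop-based delegation (instead of yield from) keeps the sketch usable on Python versions older than 3.3.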

It can be simplified even further if you sort the list just once before you pass it to the function. Then you only have to filter the items instead of sorting them multiple times:

def return_hierarchical(lst, parent=0):
    for item in (i for i in lst if i[3] == parent):
        yield item
        for subitem in return_hierarchical(lst, parent=item[2]):
            yield subitem

>>> list(return_hierarchical(sorted(lst, key=lambda x: x[1], reverse=True)))
[('Ing3', 60, 5, 0),
 ('Ing1', 30, 1, 0),
 ('Sub2', 60, 3, 1),
 ('Sub4', 85, 5, 3),
 ('Sub3', 15, 4, 3),
 ('Sub1', 30, 2, 1),
 ('Sub5', 10, 6, 1),
 ('Ing2', 10, 7, 0)]

In Python 3.3+ you can use yield from to make it even shorter:

def return_hierarchical(lst, parent=0):
    for item in (i for i in lst if i[3] == parent):
        yield item
        yield from return_hierarchical(lst, parent=item[2])
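
Both versions above rescan the whole list for every parent. If the list is large, one possible alternative is to group the rows by parent ID once and then walk that index. The following is a minimal sketch using only built-ins; the names sort_hierarchical_indexed, children and walk are made up for this example.

```python
lst = [('Ing1', 30, 1, 0), ('Sub1', 30, 2, 1), ('Sub2', 60, 3, 1),
       ('Sub3', 15, 4, 3), ('Sub4', 85, 5, 3), ('Sub5', 10, 6, 1),
       ('Ing2', 10, 7, 0), ('Ing3', 60, 5, 0)]

def sort_hierarchical_indexed(rows, root=0):
    # group rows by their parent ID in a single pass
    children = {}
    for row in rows:
        children.setdefault(row[3], []).append(row)
    # sort each group of siblings by percentage, largest first
    for siblings in children.values():
        siblings.sort(key=lambda r: r[1], reverse=True)

    def walk(parent):
        # yield each row, immediately followed by its own subtree
        for row in children.get(parent, []):
            yield row
            for sub in walk(row[2]):
                yield sub

    return walk(root)

result = list(sort_hierarchical_indexed(lst))
print(result)
```

Like the generators above, this assumes the parent IDs form a tree without cycles; a row that is (directly or indirectly) its own ancestor would make the recursion loop until it hits the recursion limit.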

General notes:

I renamed your list to lst so it doesn't shadow the built-in list.

You're dealing with tuples but you refer to the fields by name, so you could also use collections.namedtuple. This allows you to refer to the items by attribute as well:

from collections import namedtuple

ingredient = namedtuple('Ingredient', ['ingredient', 'percentage', 'order', 'parent'])

lst = [ingredient('Ing1',30,1,0), ingredient('Sub1',30,2,1), ingredient('Sub2',60,3,1),
       ingredient('Sub3',15,4,3), ingredient('Sub4',85,5,3), ingredient('Sub5',10,6,1),
       ingredient('Ing2',10,7,0), ingredient('Ing3',60,5,0)]

def return_hierarchical(lst, parent=0):
    for item in (i for i in lst if i.parent == parent):
        yield item
        # recurse with this item's own ID (the 'order' field), not its parent's ID
        yield from return_hierarchical(lst, parent=item.order)

list(return_hierarchical(sorted(lst, key=lambda x: x.percentage, reverse=True)))

Personally I like namedtuples, but some people don't, and you said you're limited in what you can import (it's in the standard library, but nevertheless), so I only included it here at the end.

