
Pythonic way to sort a list of namedtuples by field name

I want to sort a list of named tuples without having to remember the index of the field name. My solution seems rather awkward, and I was hoping someone would have a more elegant one.

from operator import itemgetter
from collections import namedtuple

Person = namedtuple('Person', 'name age score')
seq = [
    Person(name='nick', age=23, score=100),
    Person(name='bob', age=25, score=200),
]

# sort list by name
print(sorted(seq, key=itemgetter(Person._fields.index('name'))))
# sort list by age
print(sorted(seq, key=itemgetter(Person._fields.index('age'))))

Thanks, Nick

from operator import attrgetter
from collections import namedtuple

Person = namedtuple('Person', 'name age score')
seq = [Person(name='nick', age=23, score=100),
       Person(name='bob', age=25, score=200)]

Sort list by name:

sorted(seq, key=attrgetter('name'))

Sort list by age:

sorted(seq, key=attrgetter('age'))

# the same two sorts, using a lambda as the key instead of attrgetter
sorted(seq, key=lambda x: x.name)
sorted(seq, key=lambda x: x.age)
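
Either key style also works with reverse=True and with list.sort() for sorting in place. A minimal sketch, reusing the Person tuple from the question; the variable names and the third record ('alice') are made up for illustration:

```python
from collections import namedtuple
from operator import attrgetter

Person = namedtuple('Person', 'name age score')
people = [
    Person(name='nick', age=23, score=100),
    Person(name='bob', age=25, score=200),
    Person(name='alice', age=23, score=150),  # hypothetical extra record
]

# oldest first: reverse=True flips the order produced by the key
by_age_desc = sorted(people, key=attrgetter('age'), reverse=True)

# sort the existing list in place by score instead of building a new list
people.sort(key=attrgetter('score'))

print([p.name for p in by_age_desc])
print([p.name for p in people])
```

Python's sort is stable, so records with the same age keep their original relative order, even with reverse=True.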

I tested the two alternatives given here for speed, since @zenpoy was concerned about performance.

Testing script:

import random
from collections import namedtuple
from timeit import timeit
from operator import attrgetter

runs = 10000
size = 10000
random.seed(42)  # call seed(); assigning to random.seed would only overwrite the function
Person = namedtuple('Person', 'name,age')
seq = [Person(str(random.randint(0, 10 ** 10)), random.randint(0, 100)) for _ in range(size)]

def attrgetter_test_name():
    return sorted(seq.copy(), key=attrgetter('name'))

def attrgetter_test_age():
    return sorted(seq.copy(), key=attrgetter('age'))

def lambda_test_name():
    return sorted(seq.copy(), key=lambda x: x.name)

def lambda_test_age():
    return sorted(seq.copy(), key=lambda x: x.age)

print('attrgetter_test_name', timeit(stmt=attrgetter_test_name, number=runs))
print('attrgetter_test_age', timeit(stmt=attrgetter_test_age, number=runs))
print('lambda_test_name', timeit(stmt=lambda_test_name, number=runs))
print('lambda_test_age', timeit(stmt=lambda_test_age, number=runs))

Results:

attrgetter_test_name 44.26793992166096
attrgetter_test_age 31.98247099677627
lambda_test_name 47.97959511074551
lambda_test_age 35.69356267603864

Using lambda was indeed slower: about 8% slower when sorting by name and about 12% slower when sorting by age.

EDIT:

Further testing shows the results when sorting by multiple attributes. I added the following two test cases with the same setup:

def attrgetter_test_both():
    return sorted(seq.copy(), key=attrgetter('age', 'name'))

def lambda_test_both():
    return sorted(seq.copy(), key=lambda x: (x.age, x.name))

print('attrgetter_test_both', timeit(stmt=attrgetter_test_both, number=runs))
print('lambda_test_both', timeit(stmt=lambda_test_both, number=runs))

Results:

attrgetter_test_both 92.80101586919373
lambda_test_both 96.85089983147456

Lambda still underperforms, but less so: it is now about 4% slower.
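
The relative slowdowns can be recomputed from the timings reported above. This is only arithmetic on those published numbers, not a new measurement; the dictionary layout and the helper name slowdown_percent are my own:

```python
# timings in seconds, copied from the results above: (attrgetter, lambda)
timings = {
    'name': (44.26793992166096, 47.97959511074551),
    'age': (31.98247099677627, 35.69356267603864),
    'both': (92.80101586919373, 96.85089983147456),
}

def slowdown_percent(attrgetter_time, lambda_time):
    """How much slower the lambda run was, as a percentage of the attrgetter run."""
    return (lambda_time / attrgetter_time - 1) * 100

slowdowns = {case: slowdown_percent(a, l) for case, (a, l) in timings.items()}
for case, pct in slowdowns.items():
    print(f'{case}: lambda is {pct:.1f}% slower')
```

The absolute gap is similar in all three cases (roughly 4 seconds over 10000 runs), which is why the percentage shrinks as the sort itself gets more expensive.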

Testing was done on Python 3.6.0.

Since nobody mentioned using itemgetter() with a plain index, here is how to do it with itemgetter():

from operator import itemgetter
from collections import namedtuple

Person = namedtuple('Person', 'name age score')
seq = [
    Person(name='nick', age=23, score=100),
    Person(name='bob', age=25, score=200),
]

# sort list by name (field index 0)
print(sorted(seq, key=itemgetter(0)))

# sort list by age (field index 1)
print(sorted(seq, key=itemgetter(1)))
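
Because a namedtuple is still a tuple, an index key and an attribute key pick out the same value. A small sketch of that equivalence, plus itemgetter() with several indexes; the variable names are mine:

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

Person = namedtuple('Person', 'name age score')
seq = [
    Person(name='nick', age=23, score=100),
    Person(name='bob', age=25, score=200),
]

# index 1 and attribute 'age' refer to the same field, so both sorts agree
by_index = sorted(seq, key=itemgetter(1))
by_attr = sorted(seq, key=attrgetter('age'))
assert by_index == by_attr

# several indexes produce a tuple key: sort by score (2), then by name (0)
by_score_then_name = sorted(seq, key=itemgetter(2, 0))
print(by_score_then_name)
```

The trade-off is the one raised in the question: the index form breaks silently if the field order of Person ever changes, while the attribute form does not.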

This might be a bit too 'magical' for some, but I'm partial to:

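# sort list by name
# (relies on Person.name being a property with an fget attribute, which, as far as
# I know, only holds on Python versions before 3.8)
print(sorted(seq, key=Person.name.fget))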
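
A version-tolerant variant of the same idea, as a sketch. It assumes (to the best of my knowledge) that namedtuple fields stopped being plain property objects in Python 3.8, so fget may be missing; the name_key variable and the fallback to attrgetter are my own additions:

```python
from collections import namedtuple
from operator import attrgetter

Person = namedtuple('Person', 'name age score')
seq = [
    Person(name='nick', age=23, score=100),
    Person(name='bob', age=25, score=200),
]

# use the descriptor's fget when it exists, otherwise fall back to attrgetter
name_key = getattr(Person.name, 'fget', None) or attrgetter('name')

# sort list by name
print(sorted(seq, key=name_key))
```

Either branch yields a callable that maps a Person to its name, so the sort result is the same on both old and new interpreters.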
