简体   繁体   中英

Is there a way to cast values when using operator.itemgetter() as sort key?

I've a list of lists containing string represented numbers:

nums = [['1','3'],['2','2'],['1','2'],['0','2'],['11','2']]

I need to sort these numerically ascending by the first then second entries without modify the string representation of the numbers in the original list. Also, want to avoid creating another 2nd copy of the list with everything mapped to integers explicitly--imagine this is a huge list.

Both sort() and sorted() works nicely with tuples and lists, so with a lambda key, I can do the following:

>>> sorted(nums, key=lambda n: (int(n[0]),int(n[1])) 
[['0', '2'], ['1', '2'], ['1', '3'], ['2', '2'], ['11', '2']]

Happy days...

However, I've seen a number of postings about sorting being faster using the operator.itemgetter() as key function over using a lambda. Without getting into a discussion of the validity of these claims , does anyone if it's possible apply a conversion from string to integer for comparision when using operator.itemgetter() :

The following obviously fails, due to the strings being compared as strings, not numbers:

>>> sorted(nums, key=operator.itemgetter(0,1)) 
[['0', '2'], ['1', '2'], ['1', '3'], ['11', '2'], ['2', '2']]

operator.itemgetter is fast not because it does something special in sort , but because it is entirely written in c , and does not involve calling a python function.

So what you're looking for is a C function that does what you want - itemgetter is a red herring.

In python 2, you can avoid calling pure-python functions with key=functools.partial(map, int) . This won't work in python 3, because map no longer returns a list or tuple. This also might not be any faster than your solution.

There are ways, for example using iteration_utilities.chained 1 and functools.partial :

>>> import operator import itemgetter
>>> from iteration_utilities import chained
>>> from functools import partial

>>> itemgetter_int = chained(operator.itemgetter(0, 1), partial(map, int), tuple)
>>> sorted(nums, key=itemgetter_int)
[['0', '2'], ['1', '2'], ['1', '3'], ['2', '2'], ['11', '2']]

It works but it's definetly slower than using a lambda or custom defined function.

If you really need speed you could cythonize the lambda -function (or write it in C by hand) but if you just need it in one place just use a throw-away lambda . Especially if it's in sorted because it has O(nlog(n)) comparisons so the O(n) function calls probably don't contribute much to the overall execution time.


1: This is a function in a 3rd party extension module that I authored. It needs to be seperatly installed, for example via conda or pip .

And 4 years later here is the solution I think best fits the spirit of the question:

>>> import operator
>>> ig = operator.itemgetter(0,1)
>>> nums = [['1', '3'], ['2', '2'], ['1', '2'], ['0', '2'], ['11', '2']]
>>> sorted( nums,key = lambda x : tuple(map(int,ig(x))) )
[['0', '2'], ['1', '2'], ['1', '3'], ['2', '2'], ['11', '2']]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM