简体   繁体   中英

Why is sorting a python list of tuples faster when I explicitly provide the key as the first element?

Sorting a list of tuples (dictionary keys,values pairs where the key is a random string) is faster when I do not explicitly specify that the key should be used ( edit : added operator.itemgetter(0) from comment by @Chepner and the key version is now faster!):

import timeit

setup ="""
import random
import string

random.seed('slartibartfast')
d={}
for i in range(1000):
    d[''.join(random.choice(string.ascii_uppercase) for _ in range(16))] = 0
"""
print min(timeit.Timer('for k,v in sorted(d.iteritems()): pass',
        setup=setup).repeat(7, 1000))
print min(timeit.Timer('for k,v in sorted(d.iteritems(),key=lambda x: x[0]): pass',
        setup=setup).repeat(7, 1000))
print min(timeit.Timer('for k,v in sorted(d.iteritems(),key=operator.itemgetter(0)): pass',
        setup=setup).repeat(7, 1000))

Gives:

0.575334150664
0.579534521128
0.523808984422 (the itemgetter version!)

If however I create a custom object passing the key=lambda x: x[0] explicitly to sorted makes it faster:

setup ="""
import random
import string

random.seed('slartibartfast')
d={}

class A(object):
    def __init__(self):
        self.s = ''.join(random.choice(string.ascii_uppercase) for _ in
              range(16))
    def __hash__(self): return hash(self.s)
    def __eq__(self, other):
        return self.s == other.s
    def __ne__(self, other): return self.s != other.s
    # def __cmp__(self, other): return cmp(self.s ,other.s)

for i in range(1000):
    d[A()] = 0
"""
print min(timeit.Timer('for k,v in sorted(d.iteritems()): pass',
        setup=setup).repeat(3, 1000))
print min(timeit.Timer('for k,v in sorted(d.iteritems(),key=lambda x: x[0]): pass',
        setup=setup).repeat(3, 1000))
print min(timeit.Timer('for k,v in sorted(d.iteritems(),key=operator.itemgetter(0)): pass',
        setup=setup).repeat(3, 1000))

Gives:

4.65625458083
1.87191002252
1.78853626684

Is this expected ? Seems like second element of the tuple is used in the second case but shouldn't the keys compare unequal ?

Note: uncommenting the comparison method gives worse results but still the times are at one half:

8.11941771831
5.29207000173
5.25420037046

As expected built in (address comparison) is faster.

EDIT : here are the profiling results from my original code that triggered the question - without the key method:

         12739 function calls in 0.007 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.007    0.007 <string>:1(<module>)
        1    0.000    0.000    0.007    0.007 __init__.py:6527(_refreshOrder)
        1    0.002    0.002    0.006    0.006 {sorted}
     4050    0.003    0.000    0.004    0.000 bolt.py:1040(__cmp__) # here is the custom object
     4050    0.001    0.000    0.001    0.000 {cmp}
     4050    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'sort' of 'list' objects}
      291    0.000    0.000    0.000    0.000 __init__.py:6537(<lambda>)
      291    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 bolt.py:1240(iteritems)
        1    0.000    0.000    0.000    0.000 {method 'iteritems' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

and here are the results when I specify the key:

         7027 function calls in 0.004 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.004    0.004 <string>:1(<module>)
        1    0.000    0.000    0.004    0.004 __init__.py:6527(_refreshOrder)
        1    0.001    0.001    0.003    0.003 {sorted}
     2049    0.001    0.000    0.002    0.000 bolt.py:1040(__cmp__)
     2049    0.000    0.000    0.000    0.000 {cmp}
     2049    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'sort' of 'list' objects}
      291    0.000    0.000    0.000    0.000 __init__.py:6538(<lambda>)
      291    0.000    0.000    0.000    0.000 __init__.py:6533(<lambda>)
      291    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 bolt.py:1240(iteritems)
        1    0.000    0.000    0.000    0.000 {method 'iteritems' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Apparently it is the __cmp__ and not the __eq__ method that is called ( edit cause that class defines __cmp__ but not __eq__ , see here for the order of resolution of equal and compare).

In the code here __eq__ method is indeed called (8605 times) as seen by adding debug prints (see the comments ).

So the difference is as stated in the answer by @chepner. The last thing I am not quite clear on is why are those tuple equality calls needed (IOW why we need to call eq and we don't call cmp directly).

FINAL EDIT: I asked this last point here: Why in comparing python tuples of objects is __eq__ and then __cmp__ called? - turns out it's an optimization, tuple's comparison calls __eq__ in the tuple elements, and only call cmp for not eq tuple elements. So this is now perfectly clear. I thought it called directly __cmp__ so initially it seemed to me that specifying the key is just unneeded and after Chepner's answer I was still not getting where the equal calls come in.

Gist: https://gist.github.com/Utumno/f3d25e0fe4bd0f43ceb9178a60181a53

There are two issues at play.

  1. Comparing two values of builtin types (such as int ) happens in C. Comparing two values of a class with an __eq__ method happens in Python; repeatedly calling __eq__ imposes a significant performance penalty.

  2. The function passed with key is called once per element , rather than once per comparison. This means that lambda x: x[0] is called once to build a list of A instances to be compared. Without key , you need to make O(n lg n) tuple comparisons, each of which requires a call to A.__eq__ to compare the first element of each tuple.

The first explains why your first pair of results is under a second while the second takes several seconds. The second explains why using key is faster regardless of the values being compared.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM