简体   繁体   中英

Sort generated numbers using another python generator

I'm trying to implement kind of merge sort using a python generator to find the minimum number among generated numbers and generate the next one and here is my sample code:

class GeneratorSort():
    def __init__(self, *args):
        self.values = [(arg.next(), i) for i, arg in enumerate(args)]
        self.generators = args

    def generate(self):
        r, index = min(self.values)
        self.values[index] = self.generators[index].next()
        yield r


def t(l):
    for each in l:
        yield each

l1 = [2, 5, 6, 8]
l2 = [1, 4, 5, 7]
l3 = [0, 3, 9, 10]

a = GeneratorSort(t(l1), t(l2), t(l3))

But when I try to print sort results I got only 0 and next time an error:

>>> for i in a.generate():
        print i
0

And here is the error:

>>> a.generate()
<generator object generate at 0x7fa7bcc37a00>
>>> a.generate().next()

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    a.generate().next()
  File "/home/hamid/projects/bfl/workspace/testo.py", line 10, in generate
    r, index = min(self.values)
TypeError: 'int' object is not iterable
>>> 

I expect from this function to print numbers like 1 , 2 , 3 , 4 , 5 and ... sorted. Is there any other way?

Notice that I need the use of generators.

You are replacing your (value, index) tuples with just the value:

self.values[index] = self.generators[index].next()

You need to replace that with a new tuple:

self.values[index] = (self.generators[index].next(), index)

otherwise the iterable assignment fails; you cannot assign one int to two variables.

Your generator is missing a loop and handling of empty generators:

def generate(self):
    while any(self.values):
        r, index = min(v for v in self.values if v)
        try:
            self.values[index] = (self.generators[index].next(), index)
        except StopIteration:
            self.values[index] = None
        yield r

This sets elements of your self.values list to None to indicate the iterable has been exhausted. This is not the most efficient way to handle this edge case; in a version I wrote before I used a dictionary to track active iterables and simply deleted from that to keep indices (keys) stable.

Note that you can replace your t() function with the built-in iter() function .

Demo:

>>> class GeneratorSort():
...     def __init__(self, *args):
...         self.values = [(arg.next(), i) for i, arg in enumerate(args)]
...         self.generators = args
...     def generate(self):
...         while any(self.values):
...             r, index = min(v for v in self.values if v)
...             try:
...                 self.values[index] = (self.generators[index].next(), index)
...             except StopIteration:
...                 self.values[index] = None
...             yield r
... 
>>> l1 = [2, 5, 6, 8]
>>> l2 = [1, 4, 5, 7]
>>> l3 = [0, 3, 9, 10]
>>> a = GeneratorSort(iter(l1), iter(l2), iter(l3))
>>> list(a.generate())
[0, 1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10]

The standard library does it more efficiently still with the heapq.merge() function ; it uses a heap to keep the iterables sorted by lowest value in a very efficient manner; min() needs to loop through all K iterables, while using a heap only takes log-K steps to keep the heap invariant intact.

>>> import heapq
>>> list(heapq.merge(l1, l2, l3))
[0, 1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10]

You can study the source code , which has been highly tuned for maximum performance.

I wrote this simple code using the idea of heapq.merge from Martijn Pieters

import heapq

def g1():
    for i in range(0, 30, 5):
        yield i

def g2():
    for i in range(15, 25, 2):
        yield i

def g3():
    for i in range(5, 30, 3):
        yield i

result_gen = heapq.merge(
    g1(),
    g2(),
    g3(),
)

## convert it to list
print list(result_gen)

## or simply iterate over it
for x in result_gen:
    print x

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM