Using generators to make lazy streams in Python

I've been messing with streams in Python and trying to generate the Hamming numbers (all the numbers whose only prime factors are 2, 3, or 5). The standard way of doing so, described by Dijkstra, is to observe that:

  1. The sequence of Hamming numbers begins with 1.
  2. The remaining values in the sequence are of the form 2h, 3h, and 5h, where h is any Hamming number.
  3. h is generated by outputting the value 1 and then merging together 2h, 3h, and 5h.

My implementation is this:

def hamming():
    yield 1
    yield from merge(scale_stream(hamming(), 2), scale_stream(hamming(), 3))

def merge(s1, s2):
    x1, x2 = next(s1), next(s2)
    while True:
        if x1 < x2:
            yield x1
            x1 = next(s1)
        elif x1 > x2:
            yield x2
            x2 = next(s2)
        else:
            yield x1
            x1, x2 = next(s1), next(s2)

def scale_stream(stream, scalar):
    for e in stream:
        yield e * scalar

def stream_index(stream, n):
    for i, e in enumerate(stream):
        if i+1 == n:
            return e

print(stream_index(hamming(), 300))

This does correctly produce the stream of Hamming numbers, but for whatever reason it takes more and more time the longer it runs, even though in theory the time complexity should be O(N).

I have played around with other streams before, but my intuition for them is pretty weak, so I have no idea what is going on here. I think the issue is in the recursive way I defined hamming(). I don't know whether it is a problem that every call to hamming() might spawn a new copy of the process that has to run in parallel, thereby slowing it down.

Honestly though, like I said, I have a very poor idea of what actually happens when I run it, and debugging has gotten me nowhere, so if someone with more experience can enlighten me I would really appreciate it.

The further you get into your stream, the more duplicates you're going to have to merge. The number 2**4 * 3**4 = 1296 is going to appear 70 times across your streams of multiples (8 choose 4), and your program is going to spend more time merging duplicates than outputting new items.
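To see where the 70 comes from: every nested hamming() call multiplies by either 2 or 3, so each distinct ordering of four doublings and four triplings is a separate path that ends at 1296. A small sketch of that count (derivation_paths is a name made up here for illustration, not something from the question):

```python
from math import comb  # available in Python 3.8+


def derivation_paths(a, b):
    """Count the orderings of `a` doublings and `b` triplings.

    Each ordering is one distinct chain of nested generators that
    ends up producing the value 2**a * 3**b.
    """
    return comb(a + b, a)


print(2**4 * 3**4)             # 1296
print(derivation_paths(4, 4))  # 70
```

The count grows quickly with the exponents, which fits the observation that the slowdown gets worse the further the stream runs.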

The further you go out, the more duplication you're going to be dealing with. There is no reason to expect your program to be linear.
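For comparison, here is one possible way to avoid the duplicated work. This is a sketch under an assumption, not the code from the question: instead of recursing into fresh hamming() generators, it keeps the values produced so far in a list and uses one read index per factor (the classic pointer-based formulation). The name hamming_linear and its factors parameter are hypothetical.

```python
from itertools import islice


def hamming_linear(factors=(2, 3, 5)):
    """Lazily yield, in increasing order, the numbers whose only prime
    factors are in `factors`. Every value is computed once and stored."""
    seq = [1]                   # values produced so far
    idx = [0] * len(factors)    # idx[k]: position in seq next scaled by factors[k]
    yield 1
    while True:
        # Next candidate multiple for each factor.
        candidates = [seq[i] * f for i, f in zip(idx, factors)]
        nxt = min(candidates)
        seq.append(nxt)
        yield nxt
        # Advance every index that produced nxt, which drops duplicates.
        for k, c in enumerate(candidates):
            if c == nxt:
                idx[k] += 1


print(list(islice(hamming_linear(), 10)))        # [1, 2, 3, 4, 5, 6, 8, 9, 10, 12]
print(list(islice(hamming_linear((2, 3)), 10)))  # [1, 2, 3, 4, 6, 8, 9, 12, 16, 18]
```

Passing factors=(2, 3) matches the two streams actually merged in the question's hamming(). Each step does a fixed amount of work for a fixed set of factors, so the running time should grow roughly linearly with the number of items, at the cost of keeping the whole sequence in memory.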
