Multithreaded Python script taking longer than non-threaded script

Disclaimer: I'm pretty terrible with multithreading, so it's entirely possible I'm doing something wrong.

I've written a very basic raytracer in Python, and I was looking for ways to speed it up. Multithreading seemed like an option, so I decided to try it out. However, while the original script took ~85 seconds to process my sample scene, the multithreaded script ends up taking ~125 seconds, which seems pretty counterintuitive.

Here's what the original looks like (I'm leaving out the drawing logic and so on; if someone thinks it's needed to figure out the problem, I'll put it back in):

def getPixelColor(x, y, scene):
    <some raytracing code>

def draw(outputFile, scene):
    <some file handling code>
    for y in range(scene.getHeight()):
        for x in range(scene.getWidth()):
            pixelColor = getPixelColor(x, y, scene)
            <write pixelColor to image file>

if __name__ == "__main__":
    scene = readScene()
    draw(scene)

And here's the multithreaded version:

import threading
import Queue

q = Queue.Queue()
pixelDict = dict()

class DrawThread(threading.Thread):
    def __init__(self, scene):
        self.scene = scene
        threading.Thread.__init__(self)

    def run(self):
        # Pull pixels off the shared queue until it is empty.
        while True:
            try:
                n, x, y = q.get_nowait()
            except Queue.Empty:
                break
            pixelDict[n] = getPixelColor(x, y, self.scene)
            q.task_done()

def getPixelColor(x, y, scene):
    <some raytracing code>

def draw(outputFile, scene):
    <some file handling code>
    n = 0
    work_threads = 4
    for y in range(scene.getHeight()):
        for x in range(scene.getWidth()):
            q.put_nowait((n, x, y))
            n += 1
    for i in range(work_threads):
        t = DrawThread(scene)
        t.start()
    q.join()
    for i in range(n):
        pixelColor = pixelDict[i]
        <write pixelColor to image file>

if __name__ == "__main__":
    scene = readScene()
    draw(scene)

Is there something obvious that I'm doing wrong? Or am I incorrect in assuming that multithreading would give a speed boost to a process like this?

I suspect the Python Global Interpreter Lock (GIL) is preventing your code from running in two threads at once.

What is a global interpreter lock (GIL)?

Clearly you want to take advantage of multiple CPUs. Can you split the ray tracing across processes instead of threads?
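As a rough illustration of splitting the work across processes, here is a minimal Python 3 sketch using `multiprocessing.Pool`. The image size and the `get_pixel_color` body are hypothetical stand-ins, not the asker's ray tracing code, and splitting by scanline is just one possible way to divide the work.

```python
from multiprocessing import Pool

WIDTH, HEIGHT = 64, 48  # hypothetical image size, for illustration only


def get_pixel_color(x, y):
    """Hypothetical stand-in for the real ray tracing code."""
    return (x * 255 // (WIDTH - 1), y * 255 // (HEIGHT - 1), 128)


def render_row(y):
    """Render one scanline; this is the unit of work sent to a worker."""
    return [get_pixel_color(x, y) for x in range(WIDTH)]


def render_parallel(workers=4):
    # Each worker is a separate process with its own interpreter and its
    # own GIL, so rows can be traced on different cores at the same time.
    with Pool(processes=workers) as pool:
        # map() returns the rows in the same order as the input range.
        return pool.map(render_row, range(HEIGHT))


if __name__ == "__main__":
    image = render_parallel()
    print(len(image), "rows of", len(image[0]), "pixels")
```

Handing out whole rows rather than single pixels keeps the inter-process communication overhead small relative to the tracing work. In a real tracer the scene would also have to reach each worker, for example by loading it in every process or by passing a picklable scene object.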

The multithreaded version obviously does more "work" (queueing every pixel, storing results in a dict, synchronizing the threads), so I would expect it to be slower on a single CPU.

I also dislike subclassing Thread, and just construct a new thread with t = Thread(target=myfunc); t.start(). Note that it has to be start(), not run(): calling run() directly executes myfunc in the current thread.
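One detail worth checking when constructing threads this way is the difference between `start()` and `run()`. The sketch below (Python 3; the `worker` function and thread names are made up for illustration) records which thread actually executed the target in each case.

```python
import threading


def worker(results, key):
    # Record the name of the thread that actually runs this function.
    results[key] = threading.current_thread().name


results = {}

# start() launches a new thread, and worker() runs inside it.
t1 = threading.Thread(target=worker, args=(results, "start"), name="worker-1")
t1.start()
t1.join()

# run() is an ordinary method call: worker() runs in the calling thread.
t2 = threading.Thread(target=worker, args=(results, "run"), name="worker-2")
t2.run()

print(results)
```

The entry stored under "start" is the new thread's name, while the entry stored under "run" is the name of whichever thread made the call, so no concurrency happens in the second case.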

To directly answer your question: for CPU-bound code like this, Python threading will not improve performance, and contention for the GIL may actually make it worse.
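A small experiment can make this concrete. The following Python 3 sketch times the same pure-Python, CPU-bound loop run four times serially and in four threads. The function names and loop size are arbitrary choices for illustration. On a standard CPython build with the GIL, the threaded run is typically no faster than the serial one, though exact timings vary by machine and interpreter version.

```python
import threading
import time


def count_down(n):
    # Pure-Python CPU-bound loop; the running thread holds the GIL.
    while n > 0:
        n -= 1


def timed(fn):
    """Return the wall-clock seconds taken by fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def run_serial(n, parts):
    for _ in range(parts):
        count_down(n)


def run_threaded(n, parts):
    threads = [threading.Thread(target=count_down, args=(n,))
               for _ in range(parts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    N = 2_000_000  # arbitrary amount of work per part
    print("serial:  ", timed(lambda: run_serial(N, 4)))
    print("threaded:", timed(lambda: run_threaded(N, 4)))
```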

In the larger scheme of things, I love Python and ray tracing, but you should never combine them. A Python ray tracer would be at least 2 orders of magnitude slower than a C or even C++ version of the same.

So while your question is interesting from a Python programmer's point of view, it is rather funny from a ray tracing point of view.

I suspect you may have one of two problems (or both, really). First, I agree with Joe that the Global Interpreter Lock is likely causing problems.

Second, it looks like you write to a file a lot during this process (particularly in the non-threaded version, where you do it on every iteration of the inner loop). Is it possible that you were bound by disk I/O rather than by CPU? If so, then when you added the threading you added overhead to manage the threads without resolving the actual bottleneck. When optimizing, make sure you identify your bottlenecks first, so you can at least guess which ones are likely to give you the most bang for your buck when you address them.
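One way to check where the time goes is the standard library profiler. The sketch below (Python 3) profiles a toy `draw` loop; `get_pixel_color`, `write_pixel`, and the in-memory buffer are hypothetical stand-ins for the real tracing and file-writing code, so the numbers say nothing about the asker's script and only show the procedure.

```python
import cProfile
import io
import pstats


def get_pixel_color(x, y):
    # Hypothetical stand-in for the ray tracing work.
    return sum(i * i for i in range(200)) % 256


def write_pixel(buffer, color):
    # Hypothetical stand-in for writing one pixel to the image file.
    buffer.write("%d\n" % color)


def draw(width, height):
    out = io.StringIO()  # in-memory stand-in for the output file
    for y in range(height):
        for x in range(width):
            write_pixel(out, get_pixel_color(x, y))
    return out.getvalue()


def profile_draw(width=40, height=30):
    """Profile draw() and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    draw(width, height)
    profiler.disable()
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(10)
    return report.getvalue()


if __name__ == "__main__":
    print(profile_draw())
```

Comparing the cumulative time of the tracing function against the writing function shows which one dominates. The same profile can be collected for an unmodified script with `python -m cProfile -s cumulative script.py`.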
