
How do I run two python loops concurrently?

Suppose I have the following in Python:

# A loop
for i in range(10000):
    Do Task A

# B loop
for i in range(10000):
    Do Task B

How do I run these loops simultaneously in Python?

If you want concurrency, here's a very simple example:

from multiprocessing import Process

def loop_a():
    while True:
        print("a")

def loop_b():
    while True:
        print("b")

if __name__ == '__main__':
    Process(target=loop_a).start()
    Process(target=loop_b).start()

This is just the most basic example I could think of. Be sure to read http://docs.python.org/library/multiprocessing.html to understand what's happening.

If you want to send data back to the program, I'd recommend using a Queue (which in my experience is easiest to use).

You can use a thread instead if you don't mind the global interpreter lock. Processes are more expensive to instantiate, but they offer true concurrency.

Why do you want to run the two processes at the same time? Is it because you think they will go faster (there is a good chance that they won't)? Why not run the tasks in the same loop, e.g.

for i in range(10000):
    doTaskA()
    doTaskB()

The obvious answer to your question is to use threads - see the Python threading module. However, threading is a big subject and has many pitfalls, so read up on it before you go down that route.

Alternatively you could run the tasks in separate processes, using the Python multiprocessing module. If both tasks are CPU-intensive, this will make better use of the multiple cores on your computer.

There are other options such as coroutines, stackless tasklets, greenlets, CSP, etc., but without knowing more about Task A and Task B and why they need to run at the same time, it is impossible to give a more specific answer.

There are many possible options for what you want:

use a loop

As many people have pointed out, this is the simplest way.

for i in range(10000):
    # on Python 2, use xrange instead of range
    taskA()
    taskB()

Merits: easy to understand and use, no extra library needed.

Drawbacks: taskB can only start after taskA finishes (or vice versa); they can't run simultaneously.

multiprocessing

Another thought would be: run two processes at the same time. Python provides the multiprocessing library; the following is a simple example:

from multiprocessing import Process

# args and kwargs stand in for whatever arguments your tasks take
p1 = Process(target=taskA, args=args, kwargs=kwargs)
p2 = Process(target=taskB, args=args, kwargs=kwargs)

p1.start()
p2.start()

Merits: tasks can run simultaneously in the background; you can control them (end them, stop them, etc.); tasks can exchange data and can be synchronized if they compete for the same resources.

Drawbacks: too heavy! The OS will frequently switch between them, and each process has its own data space even if the data is redundant. If you have a lot of tasks (say 100 or more), it's not what you want.

threading

A thread is like a process, just lightweight; check out this post. Their usage is quite similar:

import threading

# args and kwargs stand in for whatever arguments your tasks take
t1 = threading.Thread(target=taskA, args=args, kwargs=kwargs)
t2 = threading.Thread(target=taskB, args=args, kwargs=kwargs)

t1.start()
t2.start()
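For completeness, a self-contained variant that also waits for the threads and collects results; taskA and taskB here are made-up stand-ins for the real tasks:

```python
import threading

results = {}

def taskA():
    results["a"] = sum(range(100))    # stand-in for the real Task A

def taskB():
    results["b"] = sum(range(200))    # stand-in for the real Task B

t1 = threading.Thread(target=taskA)
t2 = threading.Thread(target=taskB)
t1.start()
t2.start()
t1.join()    # wait for both threads to finish
t2.join()
print(results["a"], results["b"])     # 4950 19900
```

The `join()` calls are what let the main thread safely read `results` afterwards.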

coroutines

Libraries like greenlet and gevent provide something called coroutines, which are supposed to be faster than threads. No examples provided; please look up how to use them if you're interested.

Merits: more flexible and lightweight.

Drawbacks: an extra library is needed, plus a learning curve.
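The answer names greenlet/gevent; as a substitute that needs no extra library, the same cooperative interleaving can be sketched with the standard library's asyncio coroutines (Python 3.7+):

```python
import asyncio

async def loop_a(out):
    for _ in range(3):
        out.append("a")
        await asyncio.sleep(0)   # yield control so loop_b can run

async def loop_b(out):
    for _ in range(3):
        out.append("b")
        await asyncio.sleep(0)

async def main():
    out = []
    await asyncio.gather(loop_a(out), loop_b(out))
    return out

out = asyncio.run(main())
print(out)   # the a's and b's interleave cooperatively
```

Unlike threads, the switch points are explicit: each coroutine runs until its `await` and only then hands control to the other.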

from threading import Thread

def loopA():
    for i in range(10000):
        pass  # Do task A

def loopB():
    for i in range(10000):
        pass  # Do task B

threadA = Thread(target=loopA)
threadB = Thread(target=loopB)
threadA.start()  # start(), not run(): run() would execute in the current thread
threadB.start()
# Do work independent of loopA and loopB
threadA.join()
threadB.join()

How about:

for i in range(10000):
    Do Task A
    Do Task B

Without more information I don't have a better answer.

You can use threading or multiprocessing.

I find that using the Pool class within the multiprocessing module works amazingly well for executing multiple processes at once within a Python script.

See the section: Using a pool of workers

Look carefully at "# launching multiple evaluations asynchronously may use more processes" in that example. Once you understand what those lines are doing, the following example I constructed will make a lot of sense.

import numpy as np
from multiprocessing import Pool

def desired_function(option, processes, data):  # add further parameters as needed
    # your code will go here. `option` allows you to make choices within your
    # script, executing the desired section of code for each pool subprocess.

    return result_array   # "for example"


result_array = np.zeros("some shape")  # This is normally populated by 1 loop; let's try 4.
processes = 4
pool = Pool(processes=processes)
args = (processes, data)            # arguments to be passed into desired_function

multiple_results = []
for i in range(processes):          # launches each worker w/ option (1-4 in this case)
    multiple_results.append(pool.apply_async(desired_function, (i+1,) + args))

results = np.array([res.get() for res in multiple_results])  # retrieves results after
                                                             # every worker is finished!

for i in range(processes):
    result_array = result_array + results[i]  # combines all datasets!

The code will basically run the desired function for a set number of processes. You will have to make sure your function can distinguish between the processes (hence the "option" variable). Additionally, it doesn't have to be an array that is populated at the end, but for my example that's how I used it. Hope this simplifies things or helps you better understand the power of multiprocessing in Python!
