Running Python on multiple cores

I have created a (rather large) program that takes quite a long time to finish, and I started looking into ways to speed it up.

I found that if I open Task Manager while the program is running, only one core is being used.

After some research, I found this question: "Why does multiprocessing use only a single core after I import numpy?", which gives a solution of os.system("taskset -p 0xff %d" % os.getpid()). However, this doesn't work for me, and my program continues to run on a single core.

I then found this: "Is Python capable of running on multiple cores?", which pointed towards using multiprocessing.

So after looking into multiprocessing, I came across this documentation on how to use it: https://docs.python.org/3/library/multiprocessing.html#examples

I tried the code:

from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

a = input("Finished")

After running the code (not in IDLE), it said this:

Finished
hello bob
Finished

Note: after it said Finished the first time, I pressed enter.

So after this I am now even more confused, and I have two questions:

First: It still doesn't run with multiple cores (I have an 8 core Intel i7).

Second: Why does it print "Finished" before it has even run the if statement code (and it's not even finished yet!)?

To answer your second question first: "Finished" is printed to the terminal because a = input("Finished") is outside of your if __name__ == '__main__': code block. It is therefore module-level code, which runs every time the module is loaded. You mention Task Manager, so you are on Windows, where multiprocessing starts each child in a fresh interpreter that imports your script again. The child therefore reaches input("Finished") and waits for you to press enter before it calls f, and the parent prints the second "Finished" once p.join() returns.
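One way to see which process runs the module-level code is to print the process name there. The following is only a minimal sketch: it swaps the question's input(...) for print(...) so that it runs without a keyboard, and the messages are illustrative. On Windows I would expect the module-level line to appear twice (once for MainProcess with __name__ equal to __main__, once for the child with __name__ equal to __mp_main__); with the "fork" start method it should appear only once.

```python
from multiprocessing import Process, current_process

# Module-level code: runs whenever this file is loaded. Under the "spawn"
# start method (the default on Windows) every child loads the file again.
print('module level: __name__ = {}, process = {}'.format(
    __name__, current_process().name))

def f(name):
    print('hello', name, 'from', current_process().name)

if __name__ == '__main__':
    # Guarded code: runs only in the process that was started as the script.
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
    print('Finished')  # Printed once, after the worker is done.
```

Moving the final prompt inside the guard, as done here, is what keeps the child from executing it.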

To answer the first question, you only created one process, which you start and then wait on before continuing. This gives you none of the benefits of multiprocessing and still incurs the overhead of creating the new process.

Because you want to create several processes, you need to create a pool via a collection of some sort (e.g. a Python list) and then start all of the processes.

In practice, you need to be concerned with more than the number of processors (such as the amount of available memory, the ability to restart workers that crash, etc.). However, here is a simple example that completes your task above.

import datetime as dt
from multiprocessing import Process, current_process
import sys

def f(name):
    print('{}: hello {} from {}'.format(
        dt.datetime.now(), name, current_process().name))
    sys.stdout.flush()

if __name__ == '__main__':
    worker_count = 8
    worker_pool = []
    for _ in range(worker_count):
        p = Process(target=f, args=('bob',))
        p.start()
        worker_pool.append(p)
    for p in worker_pool:
        p.join()  # Wait for all of the workers to finish.

    # Allow time to view results before program terminates.
    a = input("Finished")  # raw_input(...) in Python 2.
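As an alternative to managing the list of Process objects by hand, the standard library's multiprocessing.Pool can hand out work to a fixed number of workers. The sketch below is an assumption about what a CPU-bound task might look like: count_primes and run_in_parallel are hypothetical names, and the workload is a placeholder rather than anything from the question.

```python
import os
from multiprocessing import Pool

def count_primes(limit):
    """CPU-bound placeholder task: count the primes below `limit`."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in 2..sqrt(n) divides it evenly.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def run_in_parallel(limits, workers=None):
    """Run count_primes on each limit using a pool of worker processes."""
    # workers=None lets Pool choose based on the CPU count the OS reports.
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)  # Results keep input order.

if __name__ == '__main__':
    print('cores reported:', os.cpu_count())
    print(run_in_parallel([20000] * 8))
```

With a task that takes long enough, Task Manager should show several cores busy while this runs. For very small tasks the cost of starting processes and passing data can outweigh the gain.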

Also note that if you join each worker immediately after starting it, you wait for that worker to complete its task before starting the next one. This is generally undesirable unless the tasks must run sequentially.

Typically Wrong

worker_1.start()
worker_1.join()

worker_2.start()  # Must wait for worker_1 to complete before starting worker_2.
worker_2.join()

Usually Desired

worker_1.start()
worker_2.start()  # Start all workers.

worker_1.join()
worker_2.join()   # Wait for all workers to finish.
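The difference between the two orderings can be measured with a rough timing sketch. Everything here is illustrative: nap, run_one_at_a_time and run_all_together are hypothetical names, and time.sleep is only a stand-in for real work, so this shows that the workers overlap rather than how many cores they occupy.

```python
import time
from multiprocessing import Process

def nap(seconds):
    time.sleep(seconds)  # Stand-in for real work.

def run_one_at_a_time(count, seconds):
    """Start and join each worker in turn; returns elapsed seconds."""
    started = time.perf_counter()
    for _ in range(count):
        p = Process(target=nap, args=(seconds,))
        p.start()
        p.join()  # Blocks here, so the next worker cannot start yet.
    return time.perf_counter() - started

def run_all_together(count, seconds):
    """Start every worker, then join them all; returns elapsed seconds."""
    started = time.perf_counter()
    workers = [Process(target=nap, args=(seconds,)) for _ in range(count)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    return time.perf_counter() - started

if __name__ == '__main__':
    print('one at a time: {:.2f}s'.format(run_one_at_a_time(4, 0.5)))
    print('all together:  {:.2f}s'.format(run_all_together(4, 0.5)))
```

The first figure should come out at roughly count times seconds plus process start-up cost, and the second at roughly one seconds interval plus start-up cost.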
