简体   繁体   English

运行多个python脚本和多处理之间有什么区别

[英]What is the difference between running multiple python scripts and multiprocessing

I was thinking about adding multiprocessing to one of my scripts to increase performance. 我正在考虑在我的一个脚本中添加多处理以提高性能。

It's nothing fancy, there are 1-2 parameters to the main method. 它没什么特别的,主方法有1-2个参数。

Is there anything wrong with just running four of the same scripts cloned on the terminal versus actually adding multiprocessing into the python code? 仅仅运行克隆在终端上的四个相同脚本而不是实际将多处理添加到python代码中有什么问题吗?

ex for four cores: ex为四核:

~$ script.py & 
   script.py &
   script.py &
   script.py;

I've read that linux/unix os's automatically divide the programs amongst the available cores. 我已经读过linux / unix os会自动将程序划分为可用内核。

Sorry if ^ stuff i've mentioned above is totally wrong. 对不起,如果我上面提到的^东西是完全错误的。 Nothing above was formally learned, it was just all online stuff. 上面没有任何东西被正式学习,它只是所有在线的东西。

Martijn Pieters' comment hits the nail on the head, I think. 我认为Martijn Pieters的评论在头上打了一针。 If each of your processes only consumes a small amount of memory (so that you can easily have all four running in parallel without running out of RAM) and if your processes do not need to communicate with each other, it is easiest to start all four processes independently, from the shell, as you suggested. 如果你的每个进程只消耗少量的内存(这样你就可以轻松地让所有四个进程并行运行而不会耗尽RAM),并且如果你的进程不需要相互通信,那么最简单的方法是启动所有四个进程如你所建议的,从shell独立地处理。

The python multiprocessing module is very useful if you have slightly more complex requirements. 如果您的要求稍微复杂一些,python multiprocessing模块非常有用。 You might, for instance, have a program that needs to run in serial at startup, then spawn several copies for the more computationally intensive section, and finally do some post-processing in serial. 例如,您可能有一个程序需要在启动时以串行方式运行,然后为更加计算密集的部分生成多个副本,最后进行一些串行后处理。 multiprocessing would be invaluable for this sort of synchronization. multiprocessing对于这种同步非常有用。

Alternatively, your program might require a large amount of memory (maybe to store a large matrix in scientific computing, or a large database in web programming). 或者,您的程序可能需要大量内存(可能需要在科学计算中存储大型矩阵,或者在Web编程中存储大型数据库)。 multiprocessing lets you share that object among the different processes so that you don't have n copies of the data in memory (using multiprocessing.Value and multiprocessing.Array objects). multiprocessing允许您在不同进程之间共享该对象,这样您就不会在内存中拥有n个数据副本(使用multiprocessing.Valuemultiprocessing.Array对象)。

Using multiprocessing might also become the better solution if you'd want to run your script, say, 100 times on only 4 cores. 如果您想在仅4个核心上运行脚本(例如100次),则使用multiprocessing也可能成为更好的解决方案。 Then your terminal-based approach would become pretty nasty. 然后你的基于终端的方法将变得非常讨厌。

In this case you might want to use a Pool from the multiprocessing module. 在这种情况下,您可能希望使用multiprocessing模块中的Pool

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM