
Multi-processing a shell script within python

My requirement is to run a shell function or script in parallel with multi-processing. Currently I get it done with the script below, which doesn't use multi-processing. Also, when I start 10 jobs in parallel, one job might complete early and then has to wait for the other 9 jobs to finish. I wanted to eliminate this with the help of multiprocessing in python.

i=1 
total=`cat details.txt  |wc -l`
while [ $i -le $total ]
do
name=`cat details.txt | head -$i | tail -1 | awk '{print $1}'`
age=`cat details.txt | head -$i | tail -1 | awk '{print $2}'`
./new.sh $name $age  &
   if (( $i % 10 == 0 )); then wait; fi
   i=$((i+1))
done
wait

I want to run ./new.sh $name $age inside a python script with multiprocessing enabled (taking into account the number of CPUs). As you can see, the values of $name and $age change on each execution. Kindly share your thoughts.

First, your whole shell script could be replaced with a single pipeline: awk prints the name and age on separate lines, and xargs -d'\n' -n 2 -P 10 passes them to ./new.sh two arguments at a time, running up to 10 jobs in parallel:

awk '{ print $1; print $2; }' details.txt | xargs -d'\n' -n 2 -P 10 ./new.sh

A simple python solution would be:

from subprocess import check_call
from multiprocessing.dummy import Pool

def call_script(args):
    name, age = args  # unpack arguments
    check_call(["./new.sh", name, age])

def main():
    with open('details.txt') as inputfile:
        args = [line.split()[:2] for line in inputfile]
    pool = Pool(10)
    # pool = Pool()  would use the number of available processors instead
    pool.map(call_script, args)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()

Note that this uses multiprocessing.dummy.Pool (a thread pool) to call the external script, which in this case is preferable to a process pool, since all the call_script function does is invoke the script and wait for it to return. Doing that in a worker process instead of a worker thread wouldn't improve performance, since this is an IO-bound operation; it would only add the overhead of process creation and interprocess communication.
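If you prefer the standard-library concurrent.futures API, a roughly equivalent sketch would look like the following. This assumes the same details.txt layout and ./new.sh script as above, and sizes the thread pool from os.cpu_count() since the question asks to take the number of CPUs into account:

from concurrent.futures import ThreadPoolExecutor
from subprocess import check_call
import os

def call_script(args):
    name, age = args  # unpack arguments
    check_call(["./new.sh", name, age])

def main():
    with open('details.txt') as inputfile:
        args = [line.split()[:2] for line in inputfile]
    # One worker per CPU; for purely IO-bound jobs a larger count is also fine.
    with ThreadPoolExecutor(max_workers=os.cpu_count() or 1) as pool:
        # list() drains the iterator so any CalledProcessError surfaces here.
        list(pool.map(call_script, args))

if __name__ == '__main__':
    main()

Leaving the with block waits for all submitted jobs to finish, which replaces the explicit pool.close()/pool.join() calls in the version above.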
