
Optimal way to run a python script multiple times with different argument values

I have a python script that takes as input ~20 arguments. I want to run this script multiple times, with different values for the arguments each time. At the moment I use a basic bash script like the following (with more parameters and more different values for each parameter):

for com_adv_par18 in 0.288 0.289
do
  for com_adv_par19 in 0.288 0.289
  do
    for com_adv_par20 in 0.288 0.289
    do
      python alpha2.py $com_adv_par18 $com_adv_par19 $com_adv_par20
    done
  done
done

I am worried, though, that this is not the optimal way to do it, both in terms of coding and of computing time. Could you propose an alternative method to insert the parameters and run the program more efficiently?

Thanks in advance.

The answer to your question depends on a lot of things - a significant factor is the length of time each execution takes.

If you can refactor the alpha2.py script so that you can import it, then you could use a python wrapper script along these lines:

from alpha2 import do_something
from itertools import product

# define the argument lists here, e.g.:
list1 = [0.288, 0.289]
list2 = [0.288, 0.289]
list3 = [0.288, 0.289]

for args in product(list1, list2, list3):
    do_something(*args)

Each execution will still be sequential, but the advantage of this approach is that you don't suffer the overhead of loading a new python instance for every combination of parameters.

Why not use another python script to process the args and call the initial script as you want? See threads such as Run a python script from another python script, passing in args.

It really depends on what you want to optimize. Running multiple Python instances on a multiprocessor system will allow you to utilize CPU parallelism in a way you currently can't with a single Python instance, so from that perspective, your script may be just right, though you really should fix the broken quoting.

I also took the liberty of shortening the variable names, adding output redirection to a file, and adding the & background operator to run the jobs in parallel. If you have a lot of combinations, you might want to limit how many you try to run at once, but here it should be manageable with just the OS scheduler's limited IQ.

for par18 in 0.288 0.289
do
  for par19 in 0.288 0.289
  do
    for par20 in 0.288 0.289
    do
      python alpha2.py "$par18" "$par19" "$par20" >"output_${par18}_${par19}_${par20}.out" &
    done
  done
done

For controlling the number of parallel instances you run at any given time, explore xargs and GNU parallel. xargs is standard but rather basic (and its -P option is a GNU extension, so it is commonly available on Linux but not POSIX, and thus not portable to other systems), and it is clunky to use for looping over a set of combinations of values. GNU parallel is usually a third-party install, but its command-line interface for this sort of thing is rich and expressive.
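As a rough illustration of the xargs route (assuming GNU xargs for the -P option), the nested loops can print one combination per line and pipe them to xargs, which caps the number of concurrent processes. A stand-in alpha2.py is created here so the sketch is self-contained; in practice the real script already exists (note this demo invokes python3 explicitly).

```shell
# Stand-in for the real alpha2.py, only so this sketch runs on its own:
# it prints the sum of its numeric arguments.
cat > alpha2.py <<'EOF'
import sys
print(sum(float(a) for a in sys.argv[1:]))
EOF

# Emit one "par18 par19 par20" triple per line, then let GNU xargs run
# at most 4 processes at a time (-P 4), three arguments per call (-n 3).
for par18 in 0.288 0.289; do
  for par19 in 0.288 0.289; do
    for par20 in 0.288 0.289; do
      printf '%s %s %s\n' "$par18" "$par19" "$par20"
    done
  done
done |
  xargs -n 3 -P 4 sh -c 'python3 alpha2.py "$1" "$2" "$3" >"output_${1}_${2}_${3}.out"' _
```

Compared with bare & backgrounding, this keeps at most four jobs running at once no matter how many combinations the loops generate, which is the limiting behavior the answer above alludes to.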
