
How can I reevaluate params of a Snakemake rule in every run?

I have a toy Snakefile:

rule DUMMY:
    output: "foo.txt"
    resources:
        mem_mb=lambda wildcards, attempt: 1024 * 2 ** (attempt - 1)
    params:
        max_mem=lambda wildcards, resources: resources.mem_mb * 1024
    shell:
        """
        echo {resources.mem_mb} {params.max_mem}
        ulimit -v {params.max_mem}
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)' && touch {output}
        """

It works as I expect in snakemake v. 6.6.1:

$ snakemake -v
6.6.1
$ snakemake foo.txt -j 1 --restart-times 3
Building DAG of jobs...
[...]
1024 1048576
[...]
Trying to restart job 0.
Select jobs to execute...
[...]
2048 2097152
[...]
4096 4194304
[...]
8192 8388608
[Tue May 17 11:56:42 2022]
Finished job 0.
1 of 1 steps (100%) done
[...]
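
The echoed pairs follow directly from the two lambdas, and the size of the array explains why only the fourth attempt gets through. A minimal plain-Python sketch of that arithmetic (the helper names are illustrative and not part of Snakemake; it assumes `ulimit -v` takes its value in KiB, as in bash, and it counts only the array itself, not the interpreter's own footprint):

```python
# Plain-Python mirror of the two lambdas in the rule above.
def mem_mb(attempt):
    return 1024 * 2 ** (attempt - 1)


def max_mem(attempt):
    # Value handed to `ulimit -v` (assumed to be in KiB).
    return mem_mb(attempt) * 1024


# numpy.ones() defaults to float64, i.e. 8 bytes per element.
ARRAY_BYTES = 1_000_000_000 * 8


def first_sufficient_attempt(max_attempts=4):
    """First attempt whose limit exceeds the array size, or None."""
    for attempt in range(1, max_attempts + 1):
        if max_mem(attempt) * 1024 > ARRAY_BYTES:
            return attempt
    return None


for attempt in range(1, 5):
    print(attempt, mem_mb(attempt), max_mem(attempt))
print(round(ARRAY_BYTES / 2**30, 2), "GiB needed; first sufficient attempt:",
      first_sufficient_attempt())
```

With `--restart-times 3` there are four attempts in total, so the last one is the first whose limit is above the array size.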

But it fails in v. 7.3.8:

$ snakemake -v
7.3.8
$ snakemake foo.txt -j 1 --restart-times 3
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job      count    min threads    max threads
-----  -------  -------------  -------------
DUMMY        1              1              1
total        1              1              1

Select jobs to execute...

[Tue May 17 12:02:34 2022]
rule DUMMY:
    output: foo.txt
    jobid: 0
    resources: tmpdir=/tmp, mem_mb=1024

1024 1048576
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "[...]/lib/python3.7/site-packages/numpy/core/numeric.py", line 204, in ones
    a = empty(shape, dtype, order)
numpy.core._exceptions.MemoryError: Unable to allocate 7.45 GiB for an array with shape (1000000000,) and data type float64
[Tue May 17 12:02:35 2022]
Error in rule DUMMY:
    jobid: 0
    output: foo.txt
    shell:
        
        echo 1024 1048576
        ulimit -v 1048576
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)' && touch foo.txt
        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Trying to restart job 0.
[...]
2048 1048576
[...]
4096 1048576
[...]
Trying to restart job 0.
Select jobs to execute...

[Tue May 17 12:02:35 2022]
rule DUMMY:
    output: foo.txt
    jobid: 0
    resources: tmpdir=/tmp, mem_mb=8192

8192 1048576
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "[...]/lib/python3.7/site-packages/numpy/core/numeric.py", line 204, in ones
    a = empty(shape, dtype, order)
numpy.core._exceptions.MemoryError: Unable to allocate 7.45 GiB for an array with shape (1000000000,) and data type float64
[Tue May 17 12:02:35 2022]
Error in rule DUMMY:
    jobid: 0
    output: foo.txt
    shell:
        
        echo 8192 1048576
        ulimit -v 1048576
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)' && touch foo.txt
      
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-05-17T120234.647036.snakemake.log

resources.mem_mb is updated on every attempt, while params.max_mem stays stuck at 1048576. How can I force params.max_mem to update? Is this a bug or a feature?
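
To make the mismatch explicit, the pairs echoed in the 7.3.8 log can be checked against the intended relation max_mem = mem_mb * 1024. A small sketch in plain Python (the function name `stale_attempts` is made up for illustration; the numbers are copied from the log above):

```python
# (mem_mb, max_mem) pairs echoed by the job in the 7.3.8 log, one per attempt.
observed = [(1024, 1048576), (2048, 1048576), (4096, 1048576), (8192, 1048576)]


def stale_attempts(pairs):
    """Return 1-based attempt numbers where max_mem != mem_mb * 1024."""
    return [
        attempt
        for attempt, (mem, limit) in enumerate(pairs, start=1)
        if limit != mem * 1024
    ]


print(stale_attempts(observed))
```

Only the first attempt satisfies the relation; every retry reuses the value computed for attempt 1.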

Since the relation between params.max_mem and resources.mem_mb is very simple, I use bash arithmetic as a workaround:

rule DUMMY:
    output: "foo.txt"
    resources:
        mem_mb=lambda wildcards, attempt: 1024 * 2 ** (attempt - 1)
    shell:
        """
        ulimit -v $(( {resources.mem_mb} * 1024))
        echo {resources.mem_mb} $(ulimit -v)
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)' && touch {output}
        """

I am posting this as an answer because someone may find it useful, but it is definitely not a solution for more complex relations.

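By default, the resources directive assumes that a given resource is unlimited, so specifying an arbitrary resource and referencing it in shell is another solution: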

rule DUMMY:
    output: "foo.txt"
    resources:
        mem_mb=lambda wildcards, attempt: 1024 * 2 ** (attempt - 1),
        max_mem=lambda wildcards, attempt: 1024 * 2 ** (attempt - 1) * 1024,
    shell:
        """
        ulimit -v {resources.max_mem}
        echo {resources.max_mem} $(ulimit -v)
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)' && touch {output}
        """

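If the formula should not be written twice, one option along the same lines is to move it into ordinary Python functions defined at the top of the Snakefile and pass those to resources. This is only a sketch: it assumes resource callables are invoked with `wildcards` and `attempt`, as the lambdas above are, and the function names are my own choice.

```python
def mem_mb(wildcards, attempt):
    """Memory request in MB; doubles on every attempt."""
    return 1024 * 2 ** (attempt - 1)


def max_mem(wildcards, attempt):
    """Limit for `ulimit -v`, derived from the same attempt number."""
    return mem_mb(wildcards, attempt) * 1024


# In the rule, the functions would replace the lambdas:
#     resources:
#         mem_mb=mem_mb,
#         max_mem=max_mem,
```

Because both values are computed from `attempt` inside resources, nothing depends on params being reevaluated.
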
