Access nested parameters in Snakemake

I have a configuration file that looks like this:

params:
- a:
    - sample: sample_A
    - var1: blood_a
- b:
    - sample: sample_b
    - var1: blood_b

There could be more than just a and b if more samples are available. How can I work with parameters when they are nested? I've tried the approaches I would use with non-nested params (expand, format, lambda), but I haven't been successful yet.

It really depends on the context. Could you flesh out how you would like to use this construct in a rule, please?

If you have a single input file and a processing rule for that file, but you want to use several different sets of parameters when processing that file, you could set up something like:

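# assumption: the YAML from the question is saved as config.yaml
configfile: "config.yaml"

# config["params"] is a list of one-key mappings, each holding a list of
# one-key mappings; flatten it to {"a": {"sample": ..., "var1": ...}, ...}
param_sets = {
    name: {key: value for item in items for key, value in item.items()}
    for entry in config["params"]
    for name, items in entry.items()
}

rule all:
    input:
        expand("results_{param_set}", param_set=list(param_sets))

rule process_me:
    input:
        "some_file"
    output:
        "results_{param_set}"
    params:
        # wildcards are only known per job, so look the values up in a function
        sample=lambda wildcards: param_sets[wildcards.param_set]["sample"],
        var1=lambda wildcards: param_sets[wildcards.param_set]["var1"]
    shell:
        """
        touch {output}
        """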
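
One detail worth checking before indexing: because every entry in the question's YAML starts with a dash, it parses to a list of one-key mappings rather than a nested dictionary, so a lookup such as config["params"]["a"]["sample"] would fail. The following is a minimal sketch, in plain Python, of turning that shape into a dictionary keyed by parameter set. The `config` literal is written by hand to mirror what a YAML loader would return for the file in the question, and `flatten_param_sets` is a hypothetical helper name, not part of Snakemake.

```python
# Hand-written stand-in for the parsed config from the question (assumption:
# this mirrors what a YAML loader returns for the dash-prefixed layout).
config = {
    "params": [
        {"a": [{"sample": "sample_A"}, {"var1": "blood_a"}]},
        {"b": [{"sample": "sample_b"}, {"var1": "blood_b"}]},
    ]
}


def flatten_param_sets(entries):
    """Merge a list of one-key mappings into {set_name: {key: value}}."""
    flat = {}
    for entry in entries:
        for name, items in entry.items():
            merged = {}
            for item in items:
                merged.update(item)
            flat[name] = merged
    return flat


param_sets = flatten_param_sets(config["params"])
print(param_sets["a"]["sample"])  # sample_A
print(sorted(param_sets))         # ['a', 'b']
```

In a Snakefile, a dictionary of this shape can supply both the values for expand and the per-job lookups in a params function.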

It might, however, be neater to set up your config file slightly differently. If it were a tab-separated file:

param_index    sample    var1
a              sample_A    blood_a
b              sample_b    blood_b

... you could use the workflow described in the Snakemake documentation on tabular configuration ( https://snakemake.readthedocs.io/en/stable/snakefiles/configuration.html#tabular-configuration ).

## config.yaml
sample_file: samples.tsv
## samples.tsv
param_index    sample    var1
a              sample_A    blood_a
b              sample_b    blood_b
## Snakefile
import pandas as pd

configfile: "config.yaml"

samples = pd.read_table(
    config["sample_file"]
).set_index(
    "param_index", drop=False
)

rule all:
    input:
        expand(
            "results_{param_set}", param_set=samples["param_index"]
        )

rule process_me:
    input:
        "some_file"
    output:
        "results_{param_set}"
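    params:
        # look values up per job; .loc takes square brackets, not a call
        sample=lambda wildcards: samples.loc[wildcards.param_set, "sample"],
        var1=lambda wildcards: samples.loc[wildcards.param_set, "var1"]
    shell:
        """
        touch {output}
        """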

Admittedly, this isn't the best example of how to use a tabular config, but it might make it easier to catch typos and type mismatches than the YAML format does.
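
As an illustration of the kind of check a table makes easy, here is a minimal sketch that reads the tab-separated rows with Python's standard csv module (used here instead of pandas only to keep the example dependency-free) and reports empty values or duplicated param_index entries. The function name `check_samples` and the in-memory table are assumptions for illustration, not part of Snakemake.

```python
import csv
import io

# In-memory stand-in for samples.tsv (tab-separated, as in the example above).
SAMPLES_TSV = (
    "param_index\tsample\tvar1\n"
    "a\tsample_A\tblood_a\n"
    "b\tsample_b\tblood_b\n"
)

REQUIRED_COLUMNS = ("param_index", "sample", "var1")


def check_samples(handle):
    """Return (rows keyed by param_index, list of problems found)."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return {}, [f"missing columns: {', '.join(missing)}"]
    rows = {}
    problems = []
    # Data rows start on line 2; line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        key = row["param_index"]
        if key in rows:
            problems.append(f"line {line_no}: duplicate param_index {key!r}")
        if any(not row[c] for c in REQUIRED_COLUMNS):
            problems.append(f"line {line_no}: empty value")
        rows[key] = row
    return rows, problems


rows, problems = check_samples(io.StringIO(SAMPLES_TSV))
print(rows["a"]["var1"])  # blood_a
print(problems)           # []
```

A check like this could run at the top of the Snakefile, before any rule is defined, so that a malformed table stops the workflow early.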


 