I am trying to create a Snakemake workflow that takes a TSV configuration table that looks something like this:
sample path
s1 /path/to/s1_dir
s2 /path/to/s2_dir
For each sample, I provide a directory, and the workflow reads various files from paths under that directory.
I would like to get all of these inputs from a single input function. I tried this:
import pandas as pd

samples = pd.read_table('samples.tsv').set_index("sample", drop=False)

rule all:
    '...'

def get(wildcards, what):
    sample_dir = samples.loc[wildcards.sample, 'path']
    if what == 1:
        return sample_dir + '/sub/' + 'someInput'
    elif what == 2:
        return sample_dir + '/sub2/' + 'otherInput'

rule rule1:
    input:
        get(what=1)
    ...

rule rule2:
    input:
        get(what=2)
    ...
However, this results in an error message, and according to the documentation, input functions may only take a single parameter (wildcards). I guess one workaround would be having multiple input functions:
def get1(wildcards):
    sample_dir = samples.loc[wildcards.sample, 'path']
    return sample_dir + '/sub/' + 'someInput'

def get2(wildcards):
    sample_dir = samples.loc[wildcards.sample, 'path']
    return sample_dir + '/sub2/' + 'otherInput'
But what if I have 10 different inputs? Any idea how to do that?
Thanks!
Here is what I would do: combine your custom function get with a lambda function, so that Snakemake still sees a function of wildcards only:
def get(wildcards, what):
    # Do stuff with wildcards and what
    ...

rule one:
    input:
        lambda wc: get(wc, what=1)

rule two:
    input:
        lambda wc: get(wc, what=2)
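If you end up with many different inputs, a variation on the same idea might be a small factory that returns the one-argument function for you, with the sub-paths kept in a lookup table. This is only a sketch in plain Python: SAMPLE_DIRS, RELATIVE_PATHS and input_for are hypothetical names, the dict stands in for the pandas table from the question, and SimpleNamespace mimics the wildcards object so the snippet runs outside Snakemake.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the samples table (sample name -> directory).
SAMPLE_DIRS = {"s1": "/path/to/s1_dir", "s2": "/path/to/s2_dir"}

# Hypothetical lookup table: short key -> path relative to the sample directory.
RELATIVE_PATHS = {
    "some": "sub/someInput",
    "other": "sub2/otherInput",
}


def input_for(what):
    """Return an input function of `wildcards` only, bound to the key `what`."""
    relative = RELATIVE_PATHS[what]  # raises KeyError early for an unknown key

    def _input(wildcards):
        return f"{SAMPLE_DIRS[wildcards.sample]}/{relative}"

    return _input


# Outside Snakemake, a SimpleNamespace can play the role of the wildcards object.
wc = SimpleNamespace(sample="s1")
print(input_for("some")(wc))
print(input_for("other")(wc))
```

In a Snakefile, the rule would then presumably read input: input_for("some") instead of a lambda. Adding an eleventh input means adding one entry to the table rather than another function.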