Snakemake: expanding only a subset of wildcards
I can't find the solution to this probably easy problem:
I have this Snakefile, which first produces the following files:
data/sample1_P1.txt
data/sample1_P2.txt
data/sample2_P1.txt
data/sample2_P2.txt
In the next step, it just concatenates the files into one file, concatenated/concatenated.txt.
This is the minimal, reproducible example:
pairs = {"P1": "P1", "P2": "P2"}
samples = {
    "sample1": "sample1",
    "sample2": "sample2"
}

rule all:
    input: "concatenated/concatenated.txt"

rule get_txt_files:
    output:
        "data/{sample}_{pair}.txt"
    shell:
        """
        echo 1 > {output}
        """

rule concatenate:
    input:
        expand("data/{sample}_{pair}.txt", sample=samples, pair=pairs)
    output:
        "concatenated/concatenated.txt"
    shell:
        "cat {input} > {output};"
My question is simple: how can I modify the rule concatenate so that it concatenates only the files with the same sample name?
Desired output would be:
concatenated/sample1.txt
concatenated/sample2.txt
Any help would be appreciated.
EDIT
I have a very similar follow-up question, so I don't think it's necessary to open a new question:
What if my expected output were as follows:
data/sample1/sample1_P1
data/sample1/sample1_P2
data/sample2/sample2_P1
data/sample2/sample2_P2
To be clear: I only want to create a new directory and move the files into that dedicated directory.
It seemed intuitive to do it like this:
pairs = {"P1": "P1", "P2": "P2"}
samples = {
    "sample1": "sample1",
    "sample2": "sample2"
}

rule all:
    input: expand("data/{sample}/{sample}_{pair}.txt", sample=samples, pair=pairs)

rule get_txt_files:
    output:
        "data/{sample}_{pair}.txt"
    shell:
        """
        echo 1 > {output}
        """

rule reorganise:
    input:
        expand("data/{{sample}}_{pair}.txt", pair=pairs)
    output:
        "data/{sample}/{sample}_{pair}.txt"
    shell:
        "mv {input} data/{wildcards.sample}/.;"
Can you spot the problem?
Thanks a lot in advance.
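rule concatenate:
    input:
        # {{sample}} is escaped, so expand() only fills in {pair};
        # {sample} stays a wildcard that is resolved from the output file name.
        expand("data/{{sample}}_{pair}.txt", pair=pairs)
    output:
        "concatenated/{sample}.txt"
    shell:
        "cat {input} > {output};"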
Answer to the question in the comments:
from snakemake.io import expand # automatically imported in Snakemake
expand("data/{{sample}}_{pair}.txt", pair="A B C".split())
# ['data/{sample}_A.txt', 'data/{sample}_B.txt', 'data/{sample}_C.txt']
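To make the two steps behind the concatenate rule explicit, here is a minimal sketch in plain Python. It assumes nothing beyond the standard library: expand_like is a hypothetical, simplified stand-in for Snakemake's expand(), not its real implementation, and it relies on the fact that str.format turns an escaped "{{sample}}" into a literal "{sample}".

```python
from itertools import product

pairs = ["P1", "P2"]
samples = ["sample1", "sample2"]


def expand_like(pattern, **values):
    """Hypothetical, simplified stand-in for Snakemake's expand()."""
    keys = list(values)
    combos = product(*(values[key] for key in keys))
    # str.format turns the escaped "{{sample}}" into a literal "{sample}".
    return [pattern.format(**dict(zip(keys, combo))) for combo in combos]


# Step 1: only {pair} is expanded; {sample} is left as a wildcard.
templates = expand_like("data/{{sample}}_{pair}.txt", pair=pairs)

# Step 2: the remaining wildcard is filled in once per requested output file.
inputs_by_output = {
    f"concatenated/{sample}.txt": [t.format(sample=sample) for t in templates]
    for sample in samples
}

for output, inputs in inputs_by_output.items():
    print(output, "<-", inputs)
```

Note that Snakemake only builds the files that are requested, so to get one file per sample, rule all would presumably have to ask for expand("concatenated/{sample}.txt", sample=samples) instead of "concatenated/concatenated.txt".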
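As for the reorganise rule in the follow-up, one likely reading of the problem is that its output depends on both {sample} and {pair}, while its expanded input depends on {sample} only. The sketch below models this in plain Python; job_inputs and job_inputs_per_pair are hypothetical helpers written for illustration, not Snakemake functions.

```python
pairs = ["P1", "P2"]
samples = ["sample1", "sample2"]

# Input templates of the reorganise rule after expand(): one entry per pair,
# with {sample} still unresolved.
input_templates = [f"data/{{sample}}_{pair}.txt" for pair in pairs]


def job_inputs(sample, pair):
    """Inputs one reorganise job would request (hypothetical helper)."""
    # pair is part of the output name, but the expanded inputs ignore it.
    return [t.format(sample=sample) for t in input_templates]


def job_inputs_per_pair(sample, pair):
    """Alternative without expand(): exactly one input per output."""
    return ["data/{sample}_{pair}.txt".format(sample=sample, pair=pair)]


jobs = {(s, p): job_inputs(s, p) for s in samples for p in pairs}
for (sample, pair), inputs in jobs.items():
    print(f"data/{sample}/{sample}_{pair}.txt <- {inputs}")
```

In this model, the jobs for sample1_P1 and sample1_P2 request the same two input files, so whichever job runs first would move both files away from the other. If that is indeed the issue, using "data/{sample}_{pair}.txt" as the input without expand() and "mv {input} {output}" as the shell command should give each job exactly one file to move. This is untested here; wildcard constraints may also be needed, because "data/{sample}_{pair}.txt" can match paths inside data/sample1/ as well.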