Can I use multiple conda environments in the nextflow config?

I'm writing a pipeline in Nextflow and want to use several existing conda environments, both to avoid inconsistencies in tool installation and to share specific modules of the pipeline. The Nextflow docs state that the best practice is to specify the conda environment in nextflow.config (see here). However, the declaration is just process.conda, which seems to apply to all processes rather than being process-specific.

I know I can just specify an existing conda environment in each process, but I'm trying to adhere to the best practices for portability.

As I haven't been able to find any documentation online for this specific issue, I have tried the following declarations in the config file:

profiles {
    conda {
        process.conda = "something" // works but single env for all processes
        fastqc.conda = "something" // where fastqc is the name of the process - FAILS
        process.fastqc.conda = "something" // FAILS
    }
}

I have also tried:

profiles {
    conda {
        process {
            withName: fastqc {
                 process.conda = "something"
            }
        }
    }
}

which also fails with the error: unknown config attribute withName

Interestingly,

process {
    conda {
        withName: fastqc {
            process.conda = "something"
        }
    }
}

does allow me to run different conda environments for each process but cannot be turned on and off by the -profile option (because specifying a profile block breaks it).

I'm not sure there's a "best practice" exactly, but the usual approach, I think, is to create a separate Conda configuration file and use the withName or withLabel process selectors to set the conda directive for each process. For example, the contents of conf/conda.config might look like:

process {

    withLabel: 'fastqc' {
        conda = 'fastqc=0.11.8=1'
    }

    withName: 'cutadapt' {
        conda = 'cutadapt=2.10=py37h516909a_0'
    }
}
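To make the selectors concrete, here is a minimal sketch of process definitions they would match. It is an illustration only: the file name main.nf, the process name run_fastqc, the output patterns, and the adapter sequence are assumptions, not part of the original pipeline. withLabel matches on a process's label directive, while withName matches on the process name itself.

```groovy
// main.nf (hypothetical): processes matched by the selectors in conf/conda.config

process run_fastqc {
    label 'fastqc'          // matched by withLabel: 'fastqc'

    input:
    path reads

    output:
    path '*_fastqc.zip'

    script:
    """
    fastqc ${reads}
    """
}

process cutadapt {          // matched by withName: 'cutadapt'
    input:
    path reads

    output:
    path 'trimmed.fastq.gz'

    script:
    """
    # adapter sequence is a placeholder for illustration
    cutadapt -a AGATCGGAAGAGC -o trimmed.fastq.gz ${reads}
    """
}
```

A label can be shared by several processes, so withLabel is handy when more than one process should run in the same environment.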

Then, in your nextflow.config, add a 'conda' profile that includes the above configuration file and enables the use of Conda environments. Note that the latter (conda.enabled = true) is now required in newer versions of Nextflow:

includeConfig 'conf/base.config'

profiles {

    'conda' {
        includeConfig 'conf/conda.config'
        conda.enabled = true
    }
}

In the above example, the conf/base.config would always be applied, regardless of profile, and might contain the usual cpus / memory / time directives and errorStrategy etc.
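As a rough sketch of what such a conf/base.config could contain, assuming placeholder resource values that you would tune for your own pipeline:

```groovy
// conf/base.config (hypothetical values): defaults applied to every process,
// whichever profile is selected
process {
    cpus          = 1
    memory        = 4.GB
    time          = 2.h

    // resubmit a failed task up to twice before giving up
    errorStrategy = 'retry'
    maxRetries    = 2
}
```

With this layout, running `nextflow run main.nf -profile conda` (main.nf being an assumed script name) applies both files, while omitting -profile applies only the base settings.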

I also couldn't find how to do this through nextflow.config.

The workaround I've used is to keep attaching the conda directive to each process, and then have this in nextflow.config:

profiles {
    conda {
        conda.enabled = true
    }
}

That way you can still switch Conda usage on or off from the command line with the -profile option.
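A minimal sketch of this per-process approach follows. The environment path is a made-up placeholder; the conda directive also accepts package specifications or an environment YAML file instead of a path.

```groovy
// main.nf (hypothetical): the environment is declared on the process itself
process fastqc {
    // path to an existing Conda environment (placeholder, adjust to your system)
    conda '/path/to/envs/fastqc'

    input:
    path reads

    output:
    path '*_fastqc.zip'

    script:
    """
    fastqc ${reads}
    """
}
```

In recent Nextflow versions the directive only takes effect when conda.enabled is true, so running with `-profile conda` uses the environment, and running without it falls back to whatever tools are on the PATH.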
