
Snakemake: SFTP on local machine

I am connected over SSH from my local machine to a remote server, and I run Snakemake on that server.
I would like to use a file that is on my local machine as the input of a rule.
Of course, since Snakemake runs on the server, from Snakemake's point of view the server is the local machine and my local machine is the remote one.

from snakemake.remote.SFTP import RemoteProvider

# I am not sure which private key is meant: is it the one I have on the server?
# I get the same result with or without private_key anyway.

# SFTP = RemoteProvider(port=22, username="myusername", private_key="/path/to/.ssh/id_rsa")
SFTP = RemoteProvider(port=22, username="myusername")

configfile: "config.json"

localrules: copyBclLocalToCluster

rule all:
    input:
        "copycluster.txt"

rule copyBclLocalToCluster:
    input:
        SFTP.remote("adress:path/to/filelocal.txt")
    output:
        "copycluster.txt"
    shell:
        "scp {input} {output}"

-----------------------------------------
Building DAG of jobs...
MissingInputException in line 26 of /path/to/Snakefile:
Missing input files for rule copyBclLocalToCluster:
adress:path/to/filelocal.txt

https://snakemake.readthedocs.io/en/stable/snakefiles/remote_files.html
The remote file addresses used must be specified with the host (domain or IP address) and the absolute path to the file on the remote server. A port may be specified if the SSH daemon on the server is listening on a port other than 22, in either the RemoteProvider or in each instance of remote():

The docs seem to say that the port should not be 22, but why? I would really like to use port 22, since I do not know how to configure another port and I am not even sure I have the rights to do so.

Is it really a port issue, or do I just not understand how to use SFTP with Snakemake?

What is the best way to use a file on my local machine as the input of my Snakemake workflow?


EDIT

The port is not the problem; I do not even need to specify it, because it is port 22.
I tried specifying the correct SSH private key:

SFTP = RemoteProvider(port=22, username="myusername", private_key="/path/to/.ssh/id_rsa")
-----------------------------
Building DAG of jobs...
MissingInputException in line 26 of /path/to/Snakefile:
Missing input files for rule copyBclLocalToCluster:
adress:path/to/filelocal.txt

If I run sftp myusername@adress:path/to/filelocal.txt . in a console on the server, it works fine.

Why doesn't it work inside Snakemake?


EDIT

When I use my password instead of the SSH key in RemoteProvider, I get the same error.

SFTP = RemoteProvider(port=22, username="myusername", password="mypassword")
--------------------------------
Building DAG of jobs...
MissingInputException in line 26 of /path/to/Snakefile:
Missing input files for rule copyBclLocalToCluster:
adress:path/to/filelocal.txt

I am sure the address, username, password and SSH key are correct and that the file exists: the same transfer works fine outside Snakemake.


EDIT

Since RemoteProvider uses pysftp, I tried to copy the same file with pysftp in a Python script.

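import pysftp

# "adress" stands for the hostname of the machine that holds the file
with pysftp.Connection("adress",
                       username="myusername",
                       private_key="/path/to/.ssh/id_rsa") as sftp:
    # get(remote_path, local_path)
    sftp.get("path/to/filelocal.txt", "/path/on/cluster/fileCOPY.txt")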

It works fine, so the problem must come from my Snakefile.


EDIT

RemoteProvider also needs ftputil, so I tried ftputil in a Python script.

import ftputil

# remote_path and local_path are the source and destination paths (not shown)
with ftputil.FTPHost("adress", "myusername", "mypassword") as ftp_host:
    print(ftp_host.getcwd())
    ftp_host.download(remote_path, local_path)
----------------------------------------------
Traceback (most recent call last):
  File "/work/username/miniconda3/envs/RNAseq_snakemake/lib/python3.6/site-packages/ftputil/host.py", line 129, in _make_session
    return factory(*args, **kwargs)
  File "/work/username/miniconda3/envs/RNAseq_snakemake/lib/python3.6/ftplib.py", line 117, in __init__
    self.connect(host)
  File "/work/username/miniconda3/envs/RNAseq_snakemake/lib/python3.6/ftplib.py", line 152, in connect
    source_address=self.source_address)
  File "/work/username/miniconda3/envs/RNAseq_snakemake/lib/python3.6/socket.py", line 724, in create_connection
    raise err
  File "/work/username/miniconda3/envs/RNAseq_snakemake/lib/python3.6/socket.py", line 713, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "sftptest.py", line 16, in <module>
    with ftputil.FTPHost("adress", "myusername", "mypassword") as ftp_host:
  File "/work/username/miniconda3/envs/RNAseq_snakemake/lib/python3.6/site-packages/ftputil/host.py", line 69, in __init__
    self._session = self._make_session()
  File "/work/username/miniconda3/envs/RNAseq_snakemake/lib/python3.6/site-packages/ftputil/host.py", line 129, in _make_session
    return factory(*args, **kwargs)
  File "/work/username/miniconda3/envs/RNAseq_snakemake/lib/python3.6/site-packages/ftputil/error.py", line 146, in __exit__
    raise FTPOSError(*exc_value.args, original_exception=exc_value)
ftputil.error.FTPOSError: [Errno 111] Connection refused
Debugging info: ftputil 3.2, Python 3.6.7 (linux)

Could this be the problem? I do not get this kind of error in Snakemake, only the missing-file error, and I do not understand why ftputil is not working.

So I did a quick and dirty search through the Snakemake source code: Snakemake makes use of ftputil, which requires a username and a password. When you do not provide an SSH key path to Snakemake, this password defaults to None, which is then passed to ftputil.

See the Snakemake source.

I agree that the default should be something more sensible, such as ~/.ssh/id_rsa, but unfortunately it is not.
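
Until the default changes, one workaround is to always pass a credential explicitly. Below is a minimal sketch of that idea: sftp_provider_kwargs is a hypothetical helper (not part of Snakemake), and the fallback to ~/.ssh/id_rsa is only the conventional key location, so adjust it if your key lives elsewhere.

```python
import os


def sftp_provider_kwargs(username, private_key=None, password=None):
    """Build keyword arguments for RemoteProvider with an explicit credential."""
    kwargs = {"username": username}
    if password is not None:
        kwargs["password"] = password
    if private_key is not None or password is None:
        # Assumption: fall back to the conventional default key location.
        kwargs["private_key"] = os.path.expanduser(private_key or "~/.ssh/id_rsa")
    return kwargs


# Intended use in a Snakefile (requires snakemake, so left as a comment):
# SFTP = RemoteProvider(**sftp_provider_kwargs("myusername"))
print(sftp_provider_kwargs("myusername"))
```

With no key and no password given, the returned dictionary contains the expanded default key path, so nothing is left as None.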

When you use SFTP in the console, you write:

sftp myusername@adress:/path/to/file .

In Snakemake's remote() function, however, you must drop the ":" between the host and the file path.
I was misled by the SFTP syntax, although the Snakemake documentation shows the correct form.

# example from snakemake doc
SFTP.remote("example.com/path/to/file.bam")

# what I was doing wrong
SFTP.remote("adress:path/to/filelocal.txt")
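
A likely reason the colon form fails, assuming the host[:port]/path layout described in the Snakemake documentation: whatever follows a ":" in the host part is expected to be a port number. The function below is a hypothetical illustration of that layout, not Snakemake's own parser.

```python
def split_remote_address(address):
    """Split 'host[:port]/path' into (host, port, path); port is None if absent."""
    netloc, slash, rest = address.partition("/")
    if not slash or not rest:
        raise ValueError(f"no path after the host in {address!r}")
    host, colon, port = netloc.partition(":")
    if colon and not port.isdecimal():
        # A ":" must be followed by a port number, not by the start of a path.
        raise ValueError(f"{port!r} is not a port number in {address!r}")
    return host, (int(port) if colon else None), "/" + rest


print(split_remote_address("example.com/path/to/file.bam"))
try:
    split_remote_address("adress:path/to/filelocal.txt")
except ValueError as err:
    print("rejected:", err)
```

Under this reading, "adress:path/to/filelocal.txt" never names a valid file, which would be consistent with the MissingInputException.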

The correct form is:

from snakemake.remote.SFTP import RemoteProvider
SFTP = RemoteProvider(port=22, username="myusername", password="mypassword")
rule all:
    input:
        # "/" instead of ":" between host and the path of the file
        SFTP.remote("adress/path/to/filelocal.txt") 
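
If you have many addresses written in the console style, the rewrite can be automated. This is a minimal sketch under the assumption that the only differences are the optional "user@" prefix and the ":" separator; scp_to_snakemake is a hypothetical helper name, not a Snakemake function.

```python
def scp_to_snakemake(spec):
    """Convert '[user@]host:path' into (user, 'host/path') for SFTP.remote()."""
    location, sep, path = spec.partition(":")
    if not sep or not path:
        raise ValueError(f"expected '[user@]host:path', got {spec!r}")
    # rpartition leaves everything in `host` when there is no "@"
    user, at, host = location.rpartition("@")
    return (user if at else None), f"{host}/{path.lstrip('/')}"


print(scp_to_snakemake("myusername@adress:/path/to/file"))
print(scp_to_snakemake("adress:path/to/filelocal.txt"))
```

The user name comes back separately because it belongs in RemoteProvider(username=...), while the second element is the string to pass to SFTP.remote().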
