dask scheduler SSHCluster() not establishing connection

miniconda3 environment:

  • windows server 2019 vm
  • python 3.9.10
  • dask + distributed 2022.2.1
  • asyncssh 2.9.0

I'm in the process of moving from a 2021.xy dask baseline to the latest version of dask. To do so, I created a new conda environment from scratch and wrote a small test program to understand the changes.

ssh (OpenSSH 8.6.0p1) to self works, but I'm unable to connect to the scheduler on localhost using SSHCluster().

A few alternatives I've tried:

  • Manual invocation of dask-scheduler and dask-worker works.
(base) >dask-scheduler
distributed.scheduler - INFO - -----------------------------------------------
distributed.http.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO - Clear task state
distributed.scheduler - INFO -   Scheduler at:    tcp://x.y.z.z:8786
distributed.scheduler - INFO -   dashboard at:                  :8787
  • Manual invocation of dask-ssh doesn't work. Judging by distributed/cli/dask_ssh, dask-ssh seems to call the SSHCluster in old_ssh.py rather than the new one, so perhaps the CLI route doesn't really matter here.
cmd: dask-ssh --scheduler localhost --scheduler-port 8786 --nthreads 1 --nworkers 2 --remote-python %USERPROFILE%\miniconda3\python.exe

[ scheduler localhost:8786 ] : '$SHELL' is not recognized as an internal or external command,

It throws an error with a traceback. On a Windows system, however, if I replace the call with

stdin, stdout, stderr = ssh.exec_command( "%SystemRoot%\System32\cmd.exe -c '" + cmd_dict["cmd"] + "'", get_pty=True )

then no errors are thrown. Perhaps the distributed team would want to add an if condition that selects a different command depending on the OS type?
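
The OS-dependent branch suggested above could look roughly like the sketch below. It is only an illustration: `wrap_for_remote_shell` and its `remote_is_windows` flag are hypothetical names, not part of distributed, and the POSIX form is an assumption inferred from the `'$SHELL' is not recognized` message. The Windows branch uses `/c`, the documented cmd.exe switch for running a command and exiting.

```python
def wrap_for_remote_shell(cmd: str, remote_is_windows: bool) -> str:
    """Build the command line handed to the SSH client's exec call.

    Hypothetical helper for illustration; not part of distributed.
    """
    if remote_is_windows:
        # cmd.exe does not expand $SHELL and does not treat single
        # quotes as quoting characters, so use /c with double quotes.
        return f'cmd.exe /c "{cmd}"'
    # Assumed POSIX form: run the command through the user's login shell.
    return f"$SHELL -c '{cmd}'"


print(wrap_for_remote_shell("dask-scheduler --port 8786", remote_is_windows=True))
print(wrap_for_remote_shell("dask-scheduler --port 8786", remote_is_windows=False))
```

How the remote OS gets detected is left open here; the SSHCluster() log further down shows the newer code probing with `uname` and then `cmd.exe /c ver`.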

  • A simple SSHCluster() invocation doesn't work.
[DEBUG] 2022-02-27 22:03:28,856 selector_events.py:59 Using selector: SelectSelector
[DEBUG] 2022-02-27 22:03:28,858 selector_events.py:59 Using selector: SelectSelector
[INFO] 2022-02-27 22:03:29,265 logging.py:92 Opening SSH connection to localhost, port 22
[INFO] 2022-02-27 22:03:29,297 logging.py:92 [conn=0] Connected to SSH server at localhost, port 22
[INFO] 2022-02-27 22:03:29,297 logging.py:92 [conn=0]   Local address: ::1, port 49571
[INFO] 2022-02-27 22:03:29,298 logging.py:92 [conn=0]   Peer address: ::1, port 22
[DEBUG] 2022-02-27 22:03:29,298 logging.py:92 [conn=0] Sending version SSH-2.0-AsyncSSH_2.9.0
[DEBUG] 2022-02-27 22:03:29,303 logging.py:92 [conn=0] Received version SSH-2.0-OpenSSH_for_Windows_8.6
[DEBUG] 2022-02-27 22:03:29,303 logging.py:92 [conn=0] Requesting key exchange
[DEBUG] 2022-02-27 22:03:29,330 logging.py:92 [conn=0] Received key exchange request
[DEBUG] 2022-02-27 22:03:29,331 logging.py:92 [conn=0] Beginning key exchange
[DEBUG] 2022-02-27 22:03:29,338 logging.py:92 [conn=0] Completed key exchange
[INFO] 2022-02-27 22:03:29,340 logging.py:92 [conn=0] Beginning auth for user maulik
[DEBUG] 2022-02-27 22:03:29,349 logging.py:92 [conn=0] Trying public key auth with ssh-ed25519 key
[DEBUG] 2022-02-27 22:03:29,351 logging.py:92 [conn=0] Signing request with ssh-ed25519 key
[INFO] 2022-02-27 22:03:29,363 logging.py:92 [conn=0] Auth for user maulik succeeded
[DEBUG] 2022-02-27 22:03:29,364 logging.py:92 [conn=0, chan=0] Set write buffer limits: low-water=16384, high-water=65536
[INFO] 2022-02-27 22:03:29,364 logging.py:92 [conn=0, chan=0] Requesting new SSH session
[DEBUG] 2022-02-27 22:03:29,409 logging.py:92 [conn=0] Received unknown global request: hostkeys-00@openssh.com
[INFO] 2022-02-27 22:03:29,410 logging.py:92 [conn=0, chan=0]   Command: uname
[INFO] 2022-02-27 22:03:29,451 logging.py:92 [conn=0, chan=0] Received exit status 1
[INFO] 2022-02-27 22:03:29,452 logging.py:92 [conn=0, chan=0] Received channel close
[INFO] 2022-02-27 22:03:29,453 logging.py:92 [conn=0, chan=0] Channel closed
[DEBUG] 2022-02-27 22:03:29,453 logging.py:92 [conn=0, chan=1] Set write buffer limits: low-water=16384, high-water=65536
[INFO] 2022-02-27 22:03:29,454 logging.py:92 [conn=0, chan=1] Requesting new SSH session
[INFO] 2022-02-27 22:03:29,455 logging.py:92 [conn=0, chan=1]   Command: cmd.exe /c ver
[INFO] 2022-02-27 22:03:29,502 logging.py:92 [conn=0, chan=1] Received exit status 0
[INFO] 2022-02-27 22:03:29,503 logging.py:92 [conn=0, chan=1] Received channel close
[INFO] 2022-02-27 22:03:29,504 logging.py:92 [conn=0, chan=1] Channel closed
[DEBUG] 2022-02-27 22:03:29,505 logging.py:92 [conn=0, chan=2] Set write buffer limits: low-water=16384, high-water=65536
[INFO] 2022-02-27 22:03:29,506 logging.py:92 [conn=0, chan=2] Requesting new SSH session
[INFO] 2022-02-27 22:03:29,507 logging.py:92 [conn=0, chan=2]   Command: set DASK_INTERNAL_INHERIT_CONFIG=*<XYZ>* && %USERPROFILE%\miniconda3\python.exe -m distributed.cli.dask_spec --spec '{"cls": "distributed.Scheduler", "opts": {"n_workers": 2, "nthreads": 1, "name": "worker", "worker_port": "12000:12500", "nanny_port": "12501:13000"}}'
distributed.deploy.ssh - INFO - Traceback (most recent call last):
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\runpy.py", line 197, in _run_module_as_main
distributed.deploy.ssh - INFO - return _run_code(code, main_globals, None,
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\runpy.py", line 87, in _run_code
distributed.deploy.ssh - INFO - exec(code, run_globals)
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\site-packages\distributed\cli\dask_spec.py", line 43, in <module>
distributed.deploy.ssh - INFO - main()
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\site-packages\click\core.py", line 1128, in __call__
distributed.deploy.ssh - INFO - return self.main(*args, **kwargs)
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\site-packages\click\core.py", line 1053, in main
distributed.deploy.ssh - INFO - rv = self.invoke(ctx)
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\site-packages\click\core.py", line 1395, in invoke
distributed.deploy.ssh - INFO - return ctx.invoke(self.callback, **ctx.params)
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\site-packages\click\core.py", line 754, in invoke
distributed.deploy.ssh - INFO - return __callback(*args, **kwargs)
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\site-packages\distributed\cli\dask_spec.py", line 27, in main
distributed.deploy.ssh - INFO - _spec.update(json.loads(spec))
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\json\__init__.py", line 346, in loads
distributed.deploy.ssh - INFO - return _default_decoder.decode(s)
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\json\decoder.py", line 337, in decode
distributed.deploy.ssh - INFO - obj, end = self.raw_decode(s, idx=_w(s, 0).end())
distributed.deploy.ssh - INFO - File "<bleh>\miniconda3\lib\json\decoder.py", line 355, in raw_decode
distributed.deploy.ssh - INFO - raise JSONDecodeError("Expecting value", s, err.value) from None
distributed.deploy.ssh - INFO - json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[INFO] 2022-02-27 22:03:30,364 logging.py:92 [conn=0, chan=2] Received exit status 1
[INFO] 2022-02-27 22:03:30,366 logging.py:92 [conn=0, chan=2] Received channel close
[INFO] 2022-02-27 22:03:30,369 logging.py:92 [conn=0, chan=2] Channel closed
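
One plausible reading of the traceback above (an assumption, not something confirmed in this thread) is a quoting problem: the `--spec` value is wrapped in single quotes, which a POSIX shell strips but cmd.exe passes through literally, so the string reaching `json.loads` would start with `'`. The sketch below reproduces the same error message with only the standard library and shows how the same argument looks when quoted with Windows rules. The `spec` dictionary is a made-up example.

```python
import json
import subprocess

# Hypothetical spec, shaped like the one in the log above.
spec = {"cls": "distributed.Scheduler", "opts": {"port": 8786}}
spec_json = json.dumps(spec)

# Assumption: on Windows the leading single quote survives, so the
# first thing the JSON decoder sees is "'" instead of "{".
broken_arg = "'" + spec_json
try:
    json.loads(broken_arg)
    error_message = ""
except json.JSONDecodeError as exc:
    error_message = str(exc)
print(error_message)  # Expecting value: line 1 column 1 (char 0)

# subprocess.list2cmdline quotes arguments using the Microsoft C runtime
# rules: double quotes around the argument, inner quotes backslash-escaped.
windows_cmdline = subprocess.list2cmdline(
    ["python", "-m", "distributed.cli.dask_spec", "--spec", spec_json]
)
print(windows_cmdline)
```

Whether a command line quoted this way survives the full SSH round trip to a Windows host is not verified here; the sketch only shows why a literal single quote would produce exactly the `Expecting value: line 1 column 1 (char 0)` error.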

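Relevant code for SSHCluster():

from dask.distributed import SSHCluster

# Raw string so the backslashes in the Windows path are kept literally
py_path = r'%USERPROFILE%\miniconda3\python.exe'

co = {"preferred_auth": "publickey", "known_hosts": None}
so = {"port": 8786}
wo = {"n_workers": 2, "nthreads": 1, "name": "worker", "worker_port": "12000:12500", "nanny_port": "12501:13000"}
# Pass the option dicts by keyword: positionally, worker_options comes before
# scheduler_options, which is why the log shows worker options in the Scheduler spec
deploy_ssh = SSHCluster(["localhost", "localhost"], connect_options=co, scheduler_options=so, worker_options=wo, remote_python=py_path)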

I seem to be having the same problem (I'm also on Windows Server 2019). I believe the root cause is that Dask expects 'cmd /c ver' to be a valid command, but your machine doesn't recognize it.

There is an open issue about this (at the time of writing) here, with a fix that seems to have been merged into the master branch in November 2021. However, I am using Dask 2022.2.1, and the fix doesn't seem to have made it into this release or any later one.
