
Ansible unable to connect to remote host over SSH: “control path does not exist” and “mux_client_read_packet: read header failed: Broken pipe”

I am trying to connect over SSH to a newly provisioned EC2 instance. However, Ansible always fails to SSH into the remote machine.

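Unfortunately, I have exhausted every resource and option I could think of for this problem. I have tried: the different configurations people suggested on similar questions; different versions of Ansible; restarting my machine; and reviewing my files a million times to make sure there are no typos.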

Here is the playbook snippet:

---
- name: Configure EC2 instance
  hosts: "localhost"
  connection: "local"
  gather_facts: no
  vars:
    REGION: "us-east-2"
    ...
  vars_files:
    - secrets.yml

  tasks:
  - name: Provision EC2 instance
    ec2:
      aws_access_key: "{{ AWS_ACCESS_KEY_ID }}"
      aws_secret_key: "{{ AWS_SECRET_ACCESS_KEY }}"
      ...
    register: ec2

  - name: Wait for SSH to come up
    wait_for_connection:
      delay: 10
      timeout: 120
    loop: "{{ ec2.instances }}"

  - name: Add new instance public DNS to host group
    add_host:
      hostname: "{{ ec2.instances[0].public_dns_name }}"
      groups: "ec2"

- name: SSH into EC2
  hosts: "ec2"
  connection: "ssh"
  remote_user: "ubuntu"
  gather_facts: yes

  tasks:
  - name: Wait for user data script to complete execution
    wait_for:
      path: /var/log/cloud-init-output.log
      search_regex: AMI BUILD COMPLETE
      delay: 15
      timeout: 120
...

My /etc/ansible/ansible.cfg file:

[defaults]
host_key_checking = False
private_key_file = /Users/dev/Projects/aws/keys/private-key.pem
stdout_callback = debug
log_path = /var/log/ansible/ansible.log

[ssh_connection]
transfer_method = scp
ssh_args = -C -o ControlMaster=auto -o ControlPersist=200 -o ConnectTimeout=30 -o ServerAliveInterval=50
scp_if_ssh = True

[persistent_connection]
connect_timeout = 300

The command used to run the playbook above:

sudo ANSIBLE_DEBUG=1 ansible-playbook infra/aws/ansible/ec2-provisioning.yml -vvvvv --ask-vault-pass

The error occurs in the "SSH into EC2" play, specifically while gathering facts. Here is the entire log block for that play/task:

2020-06-05 12:16:43,795 p=root u=9572 | PLAY [SSH into EC2] ****************************************************************************************************************************************************
2020-06-05 12:16:43,805 p=root u=9572 | TASK [Gathering Facts] *************************************************************************************************************************************************
2020-06-05 12:16:43,817 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> ESTABLISH SSH CONNECTION FOR USER: ubuntu
2020-06-05 12:16:43,818 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ansible.cfg set ssh_args: (-C)(-o)(ControlMaster=auto)(-o)(ControlPersist=200)(-o)(ConnectTimeout=30)(-o)(ServerAliveInterval=50)
2020-06-05 12:16:43,818 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
2020-06-05 12:16:43,819 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ANSIBLE_PRIVATE_KEY_FILE/private_key_file/ansible_ssh_private_key_file set: (-o)(IdentityFile="/Users/dev/Projects/aws/keys/private-key.pem")
2020-06-05 12:16:43,819 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ansible_password/ansible_ssh_password not set: (-o)(KbdInteractiveAuthentication=no)(-o)(PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey)(-o)(PasswordAuthentication=no)
2020-06-05 12:16:43,820 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User="ubuntu")
2020-06-05 12:16:43,820 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=10)
2020-06-05 12:16:43,820 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: PlayContext set ssh_common_args: ()
2020-06-05 12:16:43,821 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: PlayContext set ssh_extra_args: ()
2020-06-05 12:16:43,822 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: found only ControlPersist; added ControlPath: (-o)(ControlPath=/Users/dev/.ansible/cp/4a306014bf)
2020-06-05 12:16:43,822 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=200 -o ConnectTimeout=30 -o ServerAliveInterval=50 -o StrictHostKeyChecking=no -o 'IdentityFile="/Users/dev/Projects/aws/keys/private-key.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="ubuntu"' -o ConnectTimeout=10 -o ControlPath=/Users/dev/.ansible/cp/4a306014bf ec2-3-23-59-101.us-east-2.compute.amazonaws.com '/bin/sh -c '"'"'echo ~ubuntu && sleep 0'"'"''
2020-06-05 12:16:45,132 p=root u=9623 | <ec2-3-23-59-101.us-east-2.compute.amazonaws.com> (255, b'', b'OpenSSH_8.1p1, LibreSSL 2.7.3\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 47: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug1: Control socket "/Users/dev/.ansible/cp/4a306014bf" does not exist\r\ndebug2: resolving "ec2-3-23-59-101.us-east-2.compute.amazonaws.com" port 22\r\ndebug2: ssh_connect_direct\r\ndebug1: Connecting to ec2-3-23-59-101.us-east-2.compute.amazonaws.com [3.23.59.101] port 22.\r\ndebug2: fd 5 setting O_NONBLOCK\r\ndebug1: connect to address 3.23.59.101 port 22: Connection refused\r\nssh: connect to host ec2-3-23-59-101.us-east-2.compute.amazonaws.com port 22: Connection refused\r\n')
2020-06-05 12:16:45,137 p=root u=9572 | fatal: [ec2-3-23-59-101.us-east-2.compute.amazonaws.com]: UNREACHABLE! => {
    "changed": false,
    "unreachable": true
}

MSG:

Failed to connect to the host via ssh: OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/Users/dev/.ansible/cp/4a306014bf" does not exist
debug2: resolving "ec2-3-23-59-101.us-east-2.compute.amazonaws.com" port 22
debug2: ssh_connect_direct
debug1: Connecting to ec2-3-23-59-101.us-east-2.compute.amazonaws.com [3.23.59.101] port 22.
debug2: fd 5 setting O_NONBLOCK
debug1: connect to address 3.23.59.101 port 22: Connection refused
ssh: connect to host ec2-3-23-59-101.us-east-2.compute.amazonaws.com port 22: Connection refused

2020-06-05 12:16:45,139 p=root u=9572 | PLAY RECAP *************************************************************************************************************************************************************
2020-06-05 12:16:45,140 p=root u=9572 | ec2-3-23-59-101.us-east-2.compute.amazonaws.com : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
2020-06-05 12:16:45,140 p=root u=9572 | localhost                  : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

The part that stands out to me is: Control socket "/Users/dev/.ansible/cp/4a306014bf" does not exist

I can see the command Ansible is trying to execute, which is:

ssh -vvv -C -o ControlMaster=auto -o ControlPersist=200 -o ConnectTimeout=30 -o ServerAliveInterval=50 -o StrictHostKeyChecking=no -o 'IdentityFile="/Users/dev/Projects/aws/keys/private-key.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="ubuntu"' -o ConnectTimeout=10 -o ControlPath=/Users/dev/.ansible/cp/4a306014bf ec2-3-23-59-101.us-east-2.compute.amazonaws.com '/bin/sh -c '"'"'echo ~ubuntu && sleep 0'"'"''

If I run this command directly from the command line with SSH debugging enabled, SSH returns:

...
debug3: send packet: type 97
debug2: channel 2: is dead
debug2: channel 2: gc: notify user
debug3: mux_master_session_cleanup_cb: entering for channel 2
debug2: channel 1: rcvd close
debug2: channel 1: output open -> drain
debug2: channel 1: chan_shutdown_read (i0 o1 sock 3 wfd 3 efd -1 [closed])
debug2: channel 1: input open -> closed
debug2: channel 2: gc: user detached
debug2: channel 2: is dead
debug2: channel 2: garbage collecting
debug1: channel 2: free: client-session, nchannels 3
debug3: channel 2: status: The following connections are open:
#1 mux-control (t16 nr0 i3/0 o1/16 e[closed]/0 fd 3/3/-1 sock 3 cc -1)
#2 client-session (t4 r0 i3/0 o3/0 e[write]/0 fd -1/-1/9 sock -1 cc -1)

debug2: channel 1: obuf empty
debug2: channel 1: chan_shutdown_write (i3 o1 sock 3 wfd 3 efd -1 [closed])
debug2: channel 1: output drain -> closed
debug2: channel 1: is dead (local)
debug2: channel 1: gc: notify user
debug3: mux_master_control_cleanup_cb: entering for channel 1
debug2: channel 1: gc: user detached
debug2: channel 1: is dead (local)
debug2: channel 1: garbage collecting
debug1: channel 1: free: mux-control, nchannels 2
debug3: channel 1: status: The following connections are open:
#1 mux-control (t16 nr0 i3/0 o3/0 e[closed]/0 fd 3/3/-1 sock 3 cc -1)

debug3: mux_client_read_packet: read header failed: Broken pipe
debug2: Received exit status from master 0
debug2: set_control_persist_exit_time: schedule exit in 200 seconds

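Again, the part that stands out is the debug3: mux_client_read_packet: read header failed: Broken pipe line. If I remove the '/bin/sh -c '"'"'echo ~ubuntu && sleep 0'"'"'' part from the end of the command and run it again from the command line, it connects successfully. Unfortunately, I cannot tell Ansible to drop that part of the command.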

Thank you all very much for your help. I really don't know what a solid next step would be. I'd appreciate any suggestions or ideas.

I was never able to find the root cause of this problem (I'll update if I do), but adding a short wait before the SSH play is a workaround that works for now:

  - name: Hard wait (30 seconds) before SSHing into EC2 instance
    pause:
      seconds: 30

It now works every time.
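
The log in the question ends with "Connection refused" on port 22, which suggests that sshd on the new instance was not yet listening when the second play started. Under that assumption (the root cause was never confirmed), a condition-based wait can stand in for the fixed pause. The task below is a minimal sketch, not a verified fix: it would sit in the first (localhost) play after the provisioning task, and it assumes every item in ec2.instances exposes public_dns_name, as the add_host task in the question already does for the first instance.

```yaml
# Sketch only: probe the new instance's SSH port from the control machine
# instead of sleeping for a fixed 30 seconds.
- name: Wait for sshd to answer on each new instance
  wait_for:
    host: "{{ item.public_dns_name }}"   # assumed to be set on every instance
    port: 22
    search_regex: OpenSSH                # require the SSH banner, not just an open port
    delay: 10                            # seconds to wait before the first probe
    timeout: 120                         # fail the task after this many seconds
  loop: "{{ ec2.instances }}"
```

Unlike a hard pause, this returns as soon as the port answers and fails loudly if it never does. A variant with the same intent is to set gather_facts: no on the "SSH into EC2" play and begin it with a wait_for_connection task followed by setup, so that facts are gathered only once the host is reachable.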



 