
Cannot SSH Google Cloud VM instance after VM restart

I am using Google Cloud Platform and connect to my VM instance through the Google Cloud Console. I restarted the VM without reserving a static IP, so the ephemeral IP changed on restart. I restarted it because I noticed CPU utilization was at a constant 100%, which I figured was not the CPU of my own VM instance (Ubuntu 16.x) but the utilization of Google's shared container CPU. It was also not allowing me to SSH in to my VM instance, so I thought a restart might help.

The VM restart did help, but the IP changed :( I run Apache and Nginx servers, so I had to manually update the new IP in the respective configuration files for my apps to run. Since the restart I have been having trouble connecting to the VM instance via SSH.

  • Firewall rules - OK (set to allow port 22)
  • .ssh/sshd_conf - OK (RSAauth yes)
  • GCE VM SSH Key - OK (public key for user is saved)

I tried the following steps to resolve the issue, but in vain:

  1. Removed the SSH key pairs from metadata and SSH keys, and generated a new public key using puttyGen
  2. Verified the key formatting from puttyGen and ensured the correct public key was saved in the Google VM instance SSH keys section
  3. When I noticed that /etc/ssh/authorized_keys was empty, I reinitialized using gcloud init, which took care of the OAuth part, but this did not resolve the issue
  4. I tried the gcloud command in my local Google Cloud SDK shell, but it keeps throwing the error server refused key
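
For step 2, one quick sanity check is whether the key saved in the SSH keys section is a single OpenSSH-style line shaped like USERNAME:KEY_TYPE KEY_VALUE COMMENT (the USERNAME: prefix may be absent, depending on where the key is pasted). A file written by PuTTYgen's "Save public key" button starts with ---- BEGIN SSH2 PUBLIC KEY ---- and spans several lines, which is a different format. The sketch below is only an illustration: the function name check_metadata_key is hypothetical, and the pattern checks the overall shape of the entry, not whether the key itself is valid.

```shell
# Hypothetical helper: succeeds if the entry is one line shaped like
#   [USERNAME:]KEY_TYPE BASE64_KEY [COMMENT]
# It checks the shape only; it does not verify the key material.
check_metadata_key() {
  entry="$1"
  nl='
'
  # Reject multi-line input such as a PuTTYgen "SSH2 PUBLIC KEY" file.
  case "$entry" in
    *"$nl"*) return 1 ;;
  esac
  printf '%s\n' "$entry" | grep -Eq \
    '^([a-z_][a-z0-9_-]*:)?(ssh-rsa|ssh-ed25519|ecdsa-sha2-nistp256) [A-Za-z0-9+/]+=*( .+)?$'
}
```

A call such as check_metadata_key "$(cat key.txt)" && echo "shape looks fine" would report whether the contents of a (hypothetical) key.txt pass the check.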

Finally, here is the trace log from /var/log/syslog:

Sep 25 22:30:01 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 CRON[1746]: (root) CMD (/google/scripts/gcloud_docker_auth.sh)
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:Proxying devshell request, attempt (1 of 3)
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:Connecting to DEVSHELL_CLIENT_PORT 40159
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:writing to devshell 4 bytes
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:read from devshell 293 bytes
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:Closing devshell forwarding connection.
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:Closing client connection.
Sep 25 22:35:01 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 CRON[1774]: (root) CMD (/google/scripts/gcloud_docker_auth.sh)
Sep 25 22:37:10 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:saw no newline in the first 6 bytes Retrying...(1$
Sep 25 22:37:14 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service ERROR:root:Error, could not connect to devshell. Giving up.
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service Traceback (most recent call last):
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service   File "/google/credentials/control_server.py", line 110, i$
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service     self.hanging_socket.connect(('localhost', self.server_p$
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service   File "/usr/lib/python2.7/socket.py", line 224, in meth
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service     return getattr(self._sock,name)(*args)
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service error: [Errno 111] Connection refused
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: 2017-09-25 22:37:22,640 INFO exited: control-command-service (exit status 0; expect$
Sep 25 22:37:23 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: 2017-09-25 22:37:23,642 INFO spawned: 'control-command-service' with pid 1801
Sep 25 22:37:23 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:23 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:24 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: 2017-09-25 22:37:24,705 INFO success: control-command-service entered RUNNING state$
Sep 25 22:37:27 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:27 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:34 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Executing health check.
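
The hostname in this excerpt (cs-6000-devshell-vm-...) and the devshell services it mentions suggest the log may have been captured on a Cloud Shell VM rather than on the affected instance, in which case it would not show the SSH failure itself. On the instance, the sshd entries are the interesting ones; on Ubuntu they normally land in /var/log/auth.log. A minimal sketch for pulling them out of a syslog-style file, with the function name sshd_lines being hypothetical:

```shell
# Hypothetical helper: print only the sshd entries from a syslog-style file.
sshd_lines() {
  # Matches the "sshd[PID]:" tag that sshd puts on each entry.
  grep -E ' sshd\[[0-9]+\]: ' "$1"
}
```

For example, sshd_lines /var/log/auth.log | tail -n 20 would show the most recent entries, assuming that file exists and is readable.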

I had a similar issue, and in my case the root cause was that the configured SSH keys did not survive a restart of the VM (they disappear from the VM configuration once the instance is restarted).

I'm not quite sure what the real reason for this is, but my humble theory is that SSH keys are by default stored directly on the boot disk (not on a persistent volume). In my case the VM had been configured with the Delete boot disk when instance is deleted option enabled. I'm guessing this feature was somehow triggered after the restart, meaning the SSH keys were lost when the boot disk was deleted.

Your issue may be the guest environment.

  1. Go to the VM instances page in the Google Cloud Platform console.
  2. Click on the instance for which you want to add a startup script.
  3. Click the Edit button at the top of the page.
  4. Click on 'Enable connecting to serial ports'.
  5. Under Custom metadata, click Add item.
  6. Set 'Key' to 'startup-script' and set 'Value' to this script:

#!/bin/bash
useradd -G sudo USERNAME
echo 'USERNAME:PASSWORD' | chpasswd

  7. Click Save and then click RESET at the top of the page. You might need to wait some time for the instance to reboot.
  8. Click on 'Connect to serial port' on the page.
  9. In the new window, you might need to wait a bit and press Enter once; then you should see the login prompt.
  10. Log in using the USERNAME and PASSWORD you provided.
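
The same console steps can probably be driven from the gcloud CLI as well. The sketch below only prints the commands so they can be reviewed before anything is run. The function name print_serial_recovery_commands and its three arguments (instance name, zone, and a local file holding the startup script) are assumptions made for illustration, not part of the original answer.

```shell
# Hypothetical helper: prints (does not run) gcloud commands that roughly
# mirror the console steps above. The caller supplies the instance name,
# the zone, and the path of a local file containing the startup script.
print_serial_recovery_commands() {
  instance="$1"
  zone="$2"
  script="$3"
  # Enable the serial console and attach the startup script as metadata.
  echo "gcloud compute instances add-metadata $instance --zone $zone --metadata serial-port-enable=TRUE"
  echo "gcloud compute instances add-metadata $instance --zone $zone --metadata-from-file startup-script=$script"
  # Reset the instance so the startup script runs, then open the serial console.
  echo "gcloud compute instances reset $instance --zone $zone"
  echo "gcloud compute connect-to-serial-port $instance --zone $zone"
}
```

Calling print_serial_recovery_commands my-vm us-central1-a startup.sh (all three values are placeholders) prints four commands in the order of the steps. Since the password sits in plain text in the instance metadata, it is worth removing the startup-script key once access is restored.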

Then, inside the instance, you need to find out what is not working by validating the guest environment:

First: check whether the lines below are listed in your serial console:

  • Started Google Compute Engine Accounts Daemon
  • Started Google Compute Engine IP Forwarding Daemon
  • Started Google Compute Engine Clock Skew Daemon
  • Started Google Compute Engine Instance Setup
  • Started Google Compute Engine Startup Scripts
  • Started Google Compute Engine Shutdown Scripts
  • Started Google Compute Engine Network Setup
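
Scanning the serial console by eye for these seven lines is error-prone, so a small filter can help. This is a minimal sketch, assuming the console output has been saved or piped as plain text; the function name missing_guest_env_lines is hypothetical.

```shell
# Hypothetical helper: reads serial console output on stdin and prints the
# name of each expected guest environment component whose "Started ..." line
# does not appear in it.
missing_guest_env_lines() {
  output=$(cat)
  while IFS= read -r expected; do
    printf '%s\n' "$output" | grep -qF "Started Google Compute Engine $expected" \
      || printf '%s\n' "$expected"
  done <<'EOF'
Accounts Daemon
IP Forwarding Daemon
Clock Skew Daemon
Instance Setup
Startup Scripts
Shutdown Scripts
Network Setup
EOF
}
```

One way to feed it would be gcloud compute instances get-serial-port-output INSTANCE | missing_guest_env_lines, where INSTANCE is a placeholder; no output means all seven lines were found.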

Second: verify that the packages for the guest environment are installed by running this command in the serial console:

apt list --installed | grep google-compute

It should list the lines below:

  • google-compute-engine
  • google-compute-engine-oslogin
  • python-google-compute-engine
  • python3-google-compute-engine

Third: verify that all the services for the guest environment are enabled by running this command:

sudo systemctl list-unit-files | grep google | grep enabled

It should list the lines below:

  • google-accounts-daemon.service enabled
  • google-ip-forwarding-daemon.service enabled
  • google-clock-skew-daemon.service enabled
  • google-instance-setup.service enabled
  • google-shutdown-scripts.service enabled
  • google-startup-scripts.service enabled
  • google-network-setup.service enabled

If the output differs from the above, you may need to restart the services or install the guest environment.
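
To see which units need attention, the listing can be compared against the seven expected names. The following is a sketch only: the function name units_not_enabled is hypothetical, and it assumes the usual two-column (or wider) output of systemctl list-unit-files, with the unit name first and its state second.

```shell
# Hypothetical helper: reads `systemctl list-unit-files` output on stdin and
# prints the guest environment units that are not reported as "enabled".
units_not_enabled() {
  awk '
    BEGIN {
      list = "google-accounts-daemon google-ip-forwarding-daemon"
      list = list " google-clock-skew-daemon google-instance-setup"
      list = list " google-shutdown-scripts google-startup-scripts"
      list = list " google-network-setup"
      n = split(list, names, " ")
    }
    # Remember every unit whose state column says "enabled".
    $2 == "enabled" { enabled[$1] = 1 }
    END {
      for (i = 1; i <= n; i++) {
        unit = names[i] ".service"
        if (!(unit in enabled)) print unit
      }
    }
  '
}
```

Running sudo systemctl list-unit-files | units_not_enabled prints nothing when all seven units are enabled. Any unit it does print is a candidate for sudo systemctl enable and sudo systemctl restart, or a sign that the guest environment packages are missing.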

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original address.

 