
Why does an EC2 instance not start correctly after resizing the root EBS volume?

I was using the instructions at https://matt.berther.io/2015/02/03/how-to-resize-aws-ec2-ebs-volumes/ and http://atodorov.org/blog/2014/02/07/aws-tip-shrinking-ebs-root-volume-size/ to move to an EBS volume with less disk space. In both cases, when I attach the shrunk EBS volume to an EC2 instance (as /dev/xvda or /dev/sda1; neither works) and start it, the instance stops on its own with the message

State transition reason
Client.InstanceInitiatedShutdown: Instance initiated shutdown

After some more tinkering I found that the new volume did not have a BIOS boot partition. So I used gdisk to create one and copied the MBR from the original volume (which works, and from which I can start instances) to the new volume. Now the instance does not terminate, but I am not able to ssh into the newly launched instance.

What might be the reason for this? How can I get more information (from logs, the AWS Console, etc.) on why this is happening?

To shrink a GPT-partitioned boot EBS volume below the 8 GB that standard images seem to use, you can do the following (a slight variation of the dd method from https://matt.berther.io/2015/02/03/how-to-resize-aws-ec2-ebs-volumes/ ):

The source disk is /dev/xvdf, the target is /dev/xvdg.

  1. Shrink the source partition

    $ sudo e2fsck -f /dev/xvdf1
    $ sudo resize2fs -M /dev/xvdf1

    Will print something like

    resize2fs 1.42.12 (29-Aug-2014)
    Resizing the filesystem on /dev/xvdf1 to 257491 (4k) blocks.
    The filesystem on /dev/xvdf1 is now 257491 (4k) blocks long.

    I converted this to MB, i.e. 257491 * 4 / 1024 ~= 1006 MB

  2. Copy the above size plus a bit more from device to device (!), not just partition to partition, because that includes both the partition table and the data in the boot partition

    $ sudo dd if=/dev/xvdf of=/dev/xvdg bs=1M count=1100
  3. Now use gdisk to fix the GPT partition on the new disk

    $ sudo gdisk /dev/xvdg

    You'll be greeted with roughly

    GPT fdisk (gdisk) version 0.8.10

    Warning! Disk size is smaller than the main header indicates! Loading secondary header from the last sector of the disk! You should use 'v' to verify disk integrity, and perhaps options on the experts' menu to repair the disk.
    Caution: invalid backup GPT header, but valid main header; regenerating backup header from main header.

    Warning! One or more CRCs don't match. You should repair the disk!

    Partition table scan:
      MBR: protective
      BSD: not present
      APM: not present
      GPT: damaged

    ****************************************************************************
    Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
    verification and recovery are STRONGLY recommended.
    ****************************************************************************

    Command (? for help):

    The following is the keyboard input within gdisk. To fix the problems, the data partition that is present in the copied partition table needs to be resized to fit on the new disk. This means it needs to be recreated smaller, and its properties need to be set to match the old partition definition. I didn't test whether relocating the backup table to the actual end of the disk is required, but I did it anyway:

    • go to extra expert options: x
    • relocate backup data structures to the end of the disk: e
    • back to main menu: m

    Now to fix the partition size

    • print and note some properties of partition 1 (and other non-boot partitions if they exist):
      i
      1
      Will show something like

      Partition GUID code: 0FC63DAF-8483-4772-8E79-3D69D8477DE4 (Linux filesystem)
      Partition unique GUID: DBA66894-D218-4D7E-A33E-A9EC9BF045DB
      First sector: 4096 (at 2.0 MiB)
      Last sector: 16777182 (at 8.0 GiB)
      Partition size: 16773087 sectors (8.0 GiB)
      Attribute flags: 0000000000000000
      Partition name: 'Linux'
    • now delete
      d
      1
      and recreate the partition
      n
      1
      Enter the required parameters. All defaults worked for me here (= press Enter); when in doubt, refer to the partition information from above

      • First sector = 4096
      • Last sector = whatever is the actual end of the new disk; take the default here
      • type = 8300 (Linux)
    • The new partition's default name did not match the old one, so change it to the original one
      c
      1
      Linux (see Partition name from above)

    • The next thing to change is the partition's GUID
      x
      c
      1
      DBA66894-D218-4D7E-A33E-A9EC9BF045DB (see Partition unique GUID, not the Partition GUID code above it)
    • That should be it. Back to the main menu and print the state
      m
      i
      1
      Will now print

      Partition GUID code: 0FC63DAF-8483-4772-8E79-3D69D8477DE4 (Linux filesystem)
      Partition unique GUID: DBA66894-D218-4D7E-A33E-A9EC9BF045DB
      First sector: 4096 (at 2.0 MiB)
      Last sector: 8388574 (at 4.0 GiB)
      Partition size: 8384479 sectors (4.0 GiB)
      Attribute flags: 0000000000000000
      Partition name: 'Linux'

      The only changes should be the Last sector and the Partition size.

    • write to disk and exit
      w
      y
  4. Grow the filesystem to match the entire (smaller) disk. The first step shrank it down to the minimal size it can fit

    $ sudo resize2fs -p /dev/xvdg1
  5. We're done. Detach the volume and snapshot it.

  6. Optional step: choosing the proper Kernel ID for the AMI.

If you are dealing with a PVM image and encounter the following mount error in the instance logs

Kernel panic - not syncing: VFS: Unable to mount root

when your instance doesn't pass the startup checks, you will probably need to perform this additional step.
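
To look for this panic without logging in, the instance's console output can be saved and searched. The following is only a sketch: it assumes the AWS CLI is installed and configured, the instance ID shown is a placeholder, and `has_root_mount_panic` is a hypothetical helper name, not part of any tool.

```shell
# Hypothetical helper: succeeds if the saved console log given as $1
# contains the "Unable to mount root" kernel panic quoted above.
has_root_mount_panic() {
    grep -q 'VFS: Unable to mount root' "$1"
}

# Usage sketch (assumes a configured AWS CLI; the instance ID is a placeholder):
#   aws ec2 get-console-output --instance-id i-0123456789abcdef0 --output text > console.log
#   has_root_mount_panic console.log && echo "root mount panic found"
```

The same console output is also shown in the AWS Console, so the helper only saves scrolling through the log by hand.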

The solution to this error is to choose the proper Kernel ID for your PVM image when creating the image from your snapshot. The full list of Kernel IDs (AKIs) can be obtained here .

Choose the proper AKI for your image; they are restricted by region and architecture!
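
Going back to steps 1 and 2, the conversion from the block count reported by resize2fs to a dd count can be sketched as below. This is only an illustration: `blocks_to_mib` is a hypothetical helper, the default 4096-byte block size is taken from the "(4k)" in the resize2fs output, and how much headroom to add on top is a judgment call (the answer used count=1100 for a computed 1006).

```shell
# Hypothetical helper: convert a filesystem block count to MiB, rounding up.
# $1 = number of blocks, $2 = block size in bytes (assumed 4096 if omitted).
blocks_to_mib() {
    blocks=$1
    block_size=${2:-4096}
    echo $(( (blocks * block_size + 1048575) / 1048576 ))
}

blocks_to_mib 257491   # prints 1006
```

Rounding up rather than down matters here, because a dd count that is too small would truncate the filesystem.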

The problem was with the BIOS boot partition. I was able to solve this by first initializing an instance with a smaller EBS volume, then detaching that volume and attaching it to the instance that will be used to copy the contents from the larger volume to the smaller volume. That created a BIOS boot partition which actually works. Simply creating a new one and copying the boot partition does not work.

Now following the steps outlined in either of the two links will let you shrink the root EBS volume.
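
Whether a volume has a BIOS boot partition can be checked before trying to boot from it. A minimal sketch, assuming `sgdisk` (shipped alongside gdisk) is installed and prints its usual partition table with the type code in the sixth column; `has_bios_boot` is a hypothetical helper, and /dev/xvdg is just the device name used elsewhere on this page.

```shell
# Hypothetical helper: reads an `sgdisk -p` partition table on stdin and
# succeeds if any partition has type code EF02 (BIOS boot partition).
has_bios_boot() {
    awk '$6 == "EF02" { found = 1 } END { exit !found }'
}

# Usage sketch (device name is an example):
#   sudo sgdisk -p /dev/xvdg | has_bios_boot && echo "BIOS boot partition present"
```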

Today, with Ubuntu, none of the other solutions here work. However, I found one that does:

  1. For caution: snapshot the large volume (backup)
  2. CREATE an instance as IDENTICAL as possible to the one on which the LARGE volume works well, BUT with a SMALLER volume (desired size)
  3. Detach this new volume, ATTACH the large volume (as /dev/sda1) and START the instance
  4. ATTACH the smaller new volume as /dev/sdf
  5. LOG IN to the new instance. Mount the smaller volume on /mnt: sudo mount -t ext4 /dev/xvdf1 /mnt
  6. DELETE everything on /mnt with sudo rm -rf /mnt. Ignore the WARNING/error, since /mnt itself can't be deleted :)
  7. Copy the entire / to the smaller volume: sudo cp -ax / /mnt
  8. Exit from the instance and stop it in the AWS console
  9. Detach BOTH volumes. Now re-attach the smaller volume, IMPORTANT, as /dev/sda1
  10. Start the instance. LOG IN to the instance and confirm everything is OK with the smaller volume
  11. Delete the large volume, delete the large snapshot, and create a new snapshot of the smaller volume. END.
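
Because step 6 deletes files and step 7 copies the whole root filesystem, it may be worth confirming first that the small volume really is mounted on /mnt; otherwise both commands would act on the large volume's own /mnt directory. A small sketch, assuming the `mountpoint` utility (util-linux or busybox) is available; `require_mountpoint` is a hypothetical helper name.

```shell
# Hypothetical guard: fail loudly unless $1 is currently a mount point.
require_mountpoint() {
    if ! mountpoint -q "$1"; then
        echo "refusing to continue: $1 is not a mount point" >&2
        return 1
    fi
}

# Usage sketch, wrapping the copy from step 7:
#   require_mountpoint /mnt && sudo cp -ax / /mnt
```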

The above procedures are incomplete; they are missing these steps:

  1. Copy disk UUID
  2. Install grub boot loader
  3. Copy label
A more complete procedure can be found here:

https://medium.com/@m.yunan.helmy/decrease-the-size-of-ebs-volume-in-your-ec2-instance-ea326e951bce

This procedure is faster and simpler (no dd/resize2fs, only rsync).

Tested with newer NVMe AWS disks.

Post any questions if you need help.


 