EC2 instances in private subnets cannot access the Amazon repositories

Question

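I am trying to create an ECS cluster, and I have manually built a VPC with 3 public subnets and 3 private subnets. All 3 public subnets route 0.0.0.0/0 to the IGW, and all 3 private subnets route 0.0.0.0/0 to a NAT gateway in their route tables. There are 3 NAT gateways, one in each public subnet.

I have already created another ECS cluster with the same CloudFormation template I am trying to use now, and everything works there.

I compared the settings of the 1st VPC and the 2nd VPC (the failing one), and all of them (IGW, NAT gateways, route tables, NACLs, SGs) are identical, apart from the IP ranges, which are adjusted for the second VPC. When I try to create the ECS cluster in the second VPC, the EC2 instances in the private subnets cannot connect to the Amazon repositories, and the whole stack then rolls back because the signal from the EC2 instances is never sent to the Auto Scaling group.

Afterwards I checked the system log of the EC2 instances, and they fail to install the Amazon agents. Here is an excerpt from the log: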

Starting cloud-init: Cloud-init v. 0.7.6 running 'modules:config' at Mon, 20 Aug 2018 06:38:04 +0000. Up 10.06 seconds.
Loaded plugins: priorities, update-motd, upgrade-helper


 One of the configured repositories failed (Unknown),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Disable the repository, so yum won't use it by default. Yum will then
        just ignore the repository until you permanently enable it again or use
        --enablerepo for temporary usage:

            yum-config-manager --disable <repoid>

     4. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true

Cannot find a valid baseurl for repo: amzn-main/latest
Could not retrieve mirrorlist http://repo.eu-central-1.amazonaws.com/latest/main/mirror.list error was
12: Timeout on http://repo.eu-central-1.amazonaws.com/latest/main/mirror.list: (28, 'Connection timed out after 5001 milliseconds')
Aug 20 06:38:20 cloud-init[2116]: util.py[WARNING]: Package upgrade failed
Aug 20 06:38:20 cloud-init[2116]: cc_package_update_upgrade_install.py[WARNING]: 1 failed with exceptions, re-raising the last one
Aug 20 06:38:20 cloud-init[2116]: util.py[WARNING]: Running module package-update-upgrade-install (<module 'cloudinit.config.cc_package_update_upgrade_install' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_package_update_upgrade_install.pyc'>) failed
Generating SSH2 ED25519 host key: [  OK  ]

Starting sshd: [  OK  ]

ntpdate: Synchronizing with time server: [  OK  ]

Starting ntpd: [  OK  ]

Starting sendmail: [  OK  ]

Starting sm-client: [  OK  ]

Starting crond: [  OK  ]

Starting cgconfig service: [  OK  ]

Starting docker:    .[  OK  ]

Starting cloud-init: Cloud-init v. 0.7.6 running 'modules:final' at Mon, 20 Aug 2018 06:38:25 +0000. Up 29.91 seconds.
Loaded plugins: priorities, update-motd, upgrade-helper
Examining /var/tmp/yum-root-i85tqq/amazon-ssm-agent.rpm: amazon-ssm-agent-2.3.13.0-1.x86_64
Marking /var/tmp/yum-root-i85tqq/amazon-ssm-agent.rpm to be installed
Resolving Dependencies


 One of the configured repositories failed (Unknown),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Disable the repository, so yum won't use it by default. Yum will then
        just ignore the repository until you permanently enable it again or use
        --enablerepo for temporary usage:

            yum-config-manager --disable <repoid>

     4. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true

Cannot find a valid baseurl for repo: amzn-main/latest
Could not retrieve mirrorlist http://repo.eu-central-1.amazonaws.com/latest/main/mirror.list error was
12: Timeout on http://repo.eu-central-1.amazonaws.com/latest/main/mirror.list: (28, 'Connection timed out after 5000 milliseconds')
Loaded plugins: priorities, update-motd, upgrade-helper
[   53.291581] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[   53.297948] Bridge firewalling registered
[   53.304776] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
[   53.318481] ip_tables: (C) 2000-2006 Netfilter Core Team
[   53.510300] Initializing XFRM netlink socket
[   53.515251] Netfilter messages via NETLINK v0.30.
[   53.518920] ctnetlink v0.93: registering with nfnetlink.
[   53.688086] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready


 One of the configured repositories failed (Unknown),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Disable the repository, so yum won't use it by default. Yum will then
        just ignore the repository until you permanently enable it again or use
        --enablerepo for temporary usage:

            yum-config-manager --disable <repoid>

     4. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true

Cannot find a valid baseurl for repo: amzn-main/latest
Could not retrieve mirrorlist http://repo.eu-central-1.amazonaws.com/latest/main/mirror.list error was
12: Timeout on http://repo.eu-central-1.amazonaws.com/latest/main/mirror.list: (28, 'Connection timed out after 5000 milliseconds')
Loaded plugins: priorities, update-motd, upgrade-helper


 One of the configured repositories failed (Unknown),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Disable the repository, so yum won't use it by default. Yum will then
        just ignore the repository until you permanently enable it again or use
        --enablerepo for temporary usage:

            yum-config-manager --disable <repoid>

     4. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true

Cannot find a valid baseurl for repo: amzn-main/latest
Could not retrieve mirrorlist http://repo.eu-central-1.amazonaws.com/latest/main/mirror.list error was
12: Timeout on http://repo.eu-central-1.amazonaws.com/latest/main/mirror.list: (28, 'Connection timed out after 5001 milliseconds')
/var/lib/cloud/instance/scripts/part-001: line 9: /opt/aws/bin/cfn-init: No such file or directory
/var/lib/cloud/instance/scripts/part-001: line 10: /opt/aws/bin/cfn-signal: No such file or directory
Aug 20 06:39:13 cloud-init[2286]: util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [127]
Aug 20 06:39:13 cloud-init[2286]: cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
Aug 20 06:39:13 cloud-init[2286]: util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_scripts_user.pyc'>) failed

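I have checked the NACLs; for both inbound and outbound, everything is set to 0.0.0.0/0 and ALLOW.

In the first VPC I use the ECS-optimized AMI on t2.large (no problems), and in the second VPC on c5.xlarge (where the problem occurs).

What else could be preventing the EC2 instances from reaching the Amazon repositories?

Edit

I later found that the second VPC has an S3 endpoint attached. After some more research I found a good post on LinkedIn stating:

The Amazon Linux repositories are hosted on S3, so it is necessary to allow access to them in the S3 endpoint policy.

So when you fire up yum, it uses the magic of local DNS trickery to route to the internal S3 endpoint

I went ahead and updated my CloudFormation template, adding the extra policy below to the LaunchConfiguration, but it did not help: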

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": [
                "arn:aws:s3:::repo.eu-central-1.amazonaws.com",
                "arn:aws:s3:::repo.eu-central-1.amazonaws.com/*"
            ],
            "Effect": "Allow"
        }
    ]
}

The endpoint policy looks like this:

{
    "Statement": [
        {
            "Action": "*",
            "Effect": "Allow",
            "Resource": "*",
            "Principal": "*"
        }
    ]
}
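
The endpoint policy above allows every action on every resource for every principal, so the policy itself is unlikely to be what blocks the repository. A minimal Python sketch of that check follows, assuming the policy is available as a JSON string. The helper name `allows_everything` is hypothetical, and it only handles this simple statement shape, not the full IAM policy grammar (for example `NotAction` or a `Principal` given as a mapping).

```python
import json


def _as_list(value):
    """Policy fields may be a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def allows_everything(policy_json):
    """Return True if the policy has an unconditional allow-all statement
    and no Deny statement at all (a conservative simplification)."""
    policy = json.loads(policy_json)
    statements = _as_list(policy.get("Statement", []))

    # Any explicit Deny could narrow access, so treat it as "not allow-all".
    if any(stmt.get("Effect") == "Deny" for stmt in statements):
        return False

    for stmt in statements:
        if (
            stmt.get("Effect") == "Allow"
            and "*" in _as_list(stmt.get("Action", []))
            and "*" in _as_list(stmt.get("Resource", []))
            and stmt.get("Principal") == "*"
            and "Condition" not in stmt
        ):
            return True
    return False


# The endpoint policy quoted in the question.
ENDPOINT_POLICY = """
{
    "Statement": [
        {
            "Action": "*",
            "Effect": "Allow",
            "Resource": "*",
            "Principal": "*"
        }
    ]
}
"""

print(allows_everything(ENDPOINT_POLICY))
```

If this check passes, the next places to look are the endpoint's other settings, such as which route tables it is associated with.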

Answer 1

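In the end, after exploring every part of the AWS console, I found what was causing the problem. As I stated in the update to the original post, when an endpoint is attached to the VPC, the EC2 instances try to resolve packages and repositories internally. I went through every setting of the endpoint and found that only the route tables of the public subnets had been added to it. Once I added the route tables of the private subnets as well, the EC2 instances could reach the packages and repositories.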
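
The check described above can be sketched in Python. This is a minimal sketch, not the tooling used in this answer: it assumes the inputs are dicts shaped like the boto3 EC2 responses from `describe_vpc_endpoints` and `describe_route_tables`. The function name `subnets_missing_from_endpoint` and every ID in the sample data are hypothetical.

```python
def subnets_missing_from_endpoint(endpoint, route_tables):
    """Return the subnet IDs whose route table is not associated with the endpoint.

    endpoint:     one entry of describe_vpc_endpoints()["VpcEndpoints"]
    route_tables: describe_route_tables()["RouteTables"] for the same VPC
    """
    associated = set(endpoint.get("RouteTableIds", []))
    missing = []
    for table in route_tables:
        if table["RouteTableId"] in associated:
            continue
        for assoc in table.get("Associations", []):
            subnet_id = assoc.get("SubnetId")
            # The main-table association carries no SubnetId; skip it.
            if subnet_id:
                missing.append(subnet_id)
    return sorted(missing)


# Hypothetical sample data mirroring the situation in this answer:
# only the public route table was added to the S3 endpoint.
sample_endpoint = {
    "VpcEndpointId": "vpce-example",
    "RouteTableIds": ["rtb-public"],
}
sample_route_tables = [
    {
        "RouteTableId": "rtb-public",
        "Associations": [{"SubnetId": "subnet-public-a", "Main": False}],
    },
    {
        "RouteTableId": "rtb-private-a",
        "Associations": [{"SubnetId": "subnet-private-a", "Main": False}],
    },
    {
        "RouteTableId": "rtb-private-b",
        "Associations": [{"SubnetId": "subnet-private-b", "Main": False}],
    },
]

print(subnets_missing_from_endpoint(sample_endpoint, sample_route_tables))
```

Subnets without an explicit association fall back to the VPC's main route table; this sketch does not cover that case, so such subnets would need to be checked separately.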
