
Getting 403 (Forbidden) when loading AWS CloudFront file

I'm working on a video app and storing the files on AWS S3. Using the default URL like https://***.amazonaws.com/*** works fine, but I have decided to use CloudFront, which is faster for content delivery.

Using CF, I keep getting 403 (Forbidden) with this URL: https://***.cloudfront.net/***. Did I miss anything?

Everything works fine until I decide to load the contents from CloudFront, which points to my bucket.

Any solution please?

When restricting access to S3 content using a bucket policy that inspects the incoming Referer: header, you need to do a little bit of custom configuration to "outsmart" CloudFront.
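For reference, a Referer-inspecting bucket policy of this kind typically looks something like the following sketch (the bucket name and referring-site pattern are placeholders for your own values):

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowGetFromMySitePages",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Condition": {
                "StringLike": {
                    "aws:Referer": "https://www.example.com/*"
                }
            }
        }
    ]
}
```

Keep in mind that the Referer: header is trivially spoofable, so a policy like this is a deterrent against casual hotlinking, not real access control.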

It's important to understand that CloudFront is designed to be a well-behaved cache. By "well-behaved," I mean that CloudFront is designed to never return a response that differs from what the origin server would have returned. I'm sure you can see that is an important factor.

Let's say I have a web server (not S3) behind CloudFront, and my web site is designed so that it returns different content based on an inspection of the Referer: header... or any other http request header, like User-Agent: for example. Depending on your browser, I might return different content. How would CloudFront know this, so that it would avoid serving a user the wrong version of a certain page?

The answer is, it wouldn't be able to tell -- it can't know this. So, CloudFront's solution is not to forward most request headers to my server at all. What my web server can't see, it can't react to, so the content I return cannot vary based on headers I don't receive, which prevents CloudFront from caching and returning the wrong response, based on those headers. Web caches have an obligation to avoid returning the wrong cached content for a given page.

"But wait," you object. "My site depends on the value from a certain header in order to determine how to respond." Right, that makes sense... so we have to tell CloudFront this:

Instead of caching my pages based on just the requested path, I need you to also forward the Referer: or User-Agent: or one of several other headers as sent by the browser, and cache the response for use on other requests that include not only the same path, but also the same values for the extra header(s) that you forward to me.
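As a toy illustration of that caching rule (purely a sketch, not CloudFront's actual implementation), the cache key becomes the path plus the values of whichever headers you configure for forwarding:

```python
# Toy model of a cache keyed on path + whitelisted headers.
# Illustrative only -- not CloudFront's real implementation.

WHITELISTED = ("Referer",)  # headers configured for forwarding to the origin

def cache_key(path, headers):
    """Build a cache key from the path plus only the whitelisted headers."""
    extra = tuple(headers.get(h, "") for h in WHITELISTED)
    return (path,) + extra

# Two requests for the same path but different Referer values
# now map to different cache entries:
a = cache_key("/video.mp4", {"Referer": "https://example.com/page1"})
b = cache_key("/video.mp4", {"Referer": "https://example.com/page2"})
print(a != b)  # True -- each referring page gets its own cached copy
```

This is also why forwarding more headers fragments the cache: every distinct combination of forwarded values becomes its own entry.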

However, when the origin server is S3, CloudFront doesn't support forwarding most request headers, on the assumption that since static content is unlikely to vary, these headers would just cause it to cache multiple identical responses unnecessarily.

Your solution is not to tell CloudFront that you're using S3 as the origin. Instead, configure your distribution to use a "custom" origin, and give it the hostname of the bucket to use as the origin server hostname.

Then, you can configure CloudFront to forward the Referer: header to the origin, and your S3 bucket policy that denies/allows requests based on that header will work as expected.

Well, almost as expected. This will lower your cache hit ratio somewhat, since now the cached pages will be cached based on path + referring page. If an S3 object is referenced by more than one of your site's pages, CloudFront will cache a copy for each unique request. It sounds like a limitation, but really, it's only an artifact of proper cache behavior -- whatever gets forwarded to the back-end, almost all of it, must be used to determine whether that particular response is usable for servicing future requests.

See http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesForwardHeaders for configuring CloudFront to whitelist specific headers to send to your origin server.

Important: don't forward any headers you don't need, since every variant request reduces your hit rate further. Particularly when using S3 as the back-end for a custom origin, do not forward the Host: header, because that is probably not going to do what you expect. Select the Referer: header here, and test. S3 should begin to see the header and react accordingly.

Note that when you removed your bucket policy for testing, CloudFront would have continued to serve the cached error page unless you flushed your cache by sending an invalidation request, which causes CloudFront to purge all cached pages matching the path pattern you specify, over the course of about 15 minutes. The easiest thing to do when experimenting is to just create a new CloudFront distribution with the new configuration, since there is no charge for the distributions themselves.

When viewing the response headers from CloudFront, note the X-Cache: (hit/miss) and Age: (how long ago this particular page was cached) responses. These are also useful in troubleshooting.
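As a small illustrative sketch (generic header parsing on your side, not an AWS API), you could classify a response from those two headers like this:

```python
# Classify a CloudFront response from its X-Cache and Age headers.
# CloudFront sends values like "Hit from cloudfront" / "Miss from cloudfront".
def describe_cache(headers):
    x_cache = headers.get("X-Cache", "")
    age = headers.get("Age")
    if "Hit" in x_cache:
        return f"served from cache, cached {age}s ago"
    if "Miss" in x_cache:
        return "fetched from the origin on this request"
    return "no CloudFront cache information present"

print(describe_cache({"X-Cache": "Hit from cloudfront", "Age": "120"}))
# -> served from cache, cached 120s ago
```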


Update: @alexjs has made an important observation: instead of doing this using the bucket policy and forwarding the Referer: header to S3 for analysis -- which will hurt your cache ratio to an extent that varies with the spread of resources over referring pages -- you can use the new AWS Web Application Firewall service, which allows you to impose filtering rules against incoming requests to CloudFront, to allow or block requests based on string matching in request headers.

For this, you'd need to connect the distribution to S3 as an S3 origin (the normal configuration, contrary to what I proposed in the solution above with a "custom" origin) and use the built-in capability of CloudFront to authenticate back-end requests to S3 (so the bucket contents aren't directly accessible if requested from S3 directly by a malicious actor).

See https://www.alexjs.eu/preventing-hotlinking-using-cloudfront-waf-and-referer-checking/ for more on this option.

Also, it may be something simple. When you first upload a file to an S3 bucket, it is non-public, even if other files in that bucket are public, and even if the bucket itself is public.

To change this in the AWS Console, check the box next to the folder that you want to make public (the folder you just uploaded), and choose "Make public" from the menu.

The files in that folder (and any subfolders) will be made public, and you'll be able to serve the files from S3.

For the AWS CLI, add the "--acl public-read" option to your command, like so:

aws s3 cp index.html s3://your.remote.bucket --acl public-read

I identified another reason why CloudFront can return a 403 (Bad request). Maybe that's an edge case, but I would like to share it with you.

CloudFront implements a forwarding-loop detection mechanism to protect against forwarding-loop attacks. According to AWS support, you cannot cascade more than 2 CloudFront distributions as origins.

Let's assume you have configured CloudFront A with CloudFront B as its origin, CloudFront B with CloudFront C as its origin, and CloudFront C with an S3 bucket as its origin.

A --> B --> C --> S3 bucket (can return a 403 error)

If you request a file through CloudFront A that is located in the S3 bucket at the end of the cascade, CloudFront C will return a 403 (Bad request).

If your cascade consists of just 2 CloudFront distributions and an S3 bucket at the end, requesting a file from the S3 origin works.

A --> B --> S3 bucket (works)

For me, I had to give CodePipeline access to my S3 bucket policy. For example, something like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::mys3bucket/*"
        }
    ]
}

My requirement was to make the bucket private, so I used an OAI. The main issue I faced was that I created the OAI before creating the distribution and chose it in the origin section dropdown, and CloudFront started throwing 403s. I fixed this by letting CloudFront create the OAI while creating the CloudFront origin: I chose the origin domain name from the dropdown and selected the bucket, which gave an option to restrict the S3 bucket; then you get an option to create an Origin Access Identity and one more option called Grant Read Permissions on Bucket. Let aws/cloudfront handle it.

Sometimes AWS might fail to add the permission for the OAI in the S3 bucket; use this document to add the permission manually:

https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-restricting-access-to-s3.html#private-content-granting-permissions-to-oai
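The manually-added statement that document describes grants the OAI read access to the objects; it generally looks like the following sketch (the identity ID E2EXAMPLE1ABCDE and the bucket name are placeholders for your own values):

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity E2EXAMPLE1ABCDE"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::mys3bucket/*"
        }
    ]
}
```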

Also, make sure you have given the entry point in both S3 and CloudFront (index.html in my case).

I have not created any error pages in CloudFront; hope this saves someone's time.

Edit: Reloading the page was throwing a 403 error, so I added error pages for 403 and 404 in CloudFront, with the page set to "/index.html".

I was getting a 403 error from CloudFront for POST requests, where my origin was a domain name instead of an S3 bucket.

The reason was that POST is not allowed by default by CloudFront. I enabled POST from the Behaviors tab in the console, and then it worked.
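For reference, in the distribution's behavior settings this corresponds to the allowed-methods selection; in the JSON distribution config (as returned by `aws cloudfront get-distribution-config`, for example) it looks roughly like this fragment:

```json
"AllowedMethods": {
    "Quantity": 7,
    "Items": ["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"],
    "CachedMethods": {
        "Quantity": 2,
        "Items": ["GET", "HEAD"]
    }
}
```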


One issue could be that you haven't specified a CNAME (either a specific one or a wildcard); in that case, using your domain name doesn't work, but the CF distro URL does.

I was facing a similar issue, but in my case, in my bucket policy I had mentioned only the bucket ARN in the resources section. Instead, I needed to mention bucketname/* to allow access to all objects in that bucket. Thought it might be helpful for some people facing a similar issue.
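In other words, for object-level actions like s3:GetObject, the Resource element needs the /* object wildcard; listing both ARNs covers bucket-level and object-level actions (the bucket name is a placeholder):

```json
"Resource": [
    "arn:aws:s3:::bucketname",
    "arn:aws:s3:::bucketname/*"
]
```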

I resolved this by updating the origin domain under my CloudFront distribution.

Under the origins tab, edit the origin name: don't select the bucket name from the list directly, but rather copy the static website hosting endpoint from your S3 bucket (check under the Properties tab):

test.uk.s3-website.eu-west-2.amazonaws.com

In my case, I'm using subfolders in the same S3 bucket to deploy multiple React applications. This is what I've done:

  1. Make sure the S3 policy has "Resource": "arn:aws:s3:::s3bucketname/*"; the * is important.
  2. On the CloudFront distribution, under the origins tab, make sure you don't select from the dropdown; rather, just copy the website hosting endpoint from the S3 Properties tab, without the http://.
  3. Invalidate with "/*" if necessary.
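The website-hosting endpoint mentioned in step 2 can be sketched like this. Note the format here is an assumption based on the example above; some older regions use a dash (e.g. s3-website-us-east-1) instead of a dot, so always verify against the bucket's Properties tab:

```python
# Build an S3 static-website-hosting endpoint for a bucket.
# NOTE: newer regions use a dot ("s3-website.eu-west-2"); some older
# regions use a dash ("s3-website-us-east-1") -- check the S3 console.
def website_endpoint(bucket, region):
    return f"{bucket}.s3-website.{region}.amazonaws.com"

print(website_endpoint("test.uk", "eu-west-2"))
# -> test.uk.s3-website.eu-west-2.amazonaws.com
```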
