Understanding server architecture: Delivering content from AWS S3 using Nginx reverse-proxy or Apache server

The purpose of this question is to understand the strategy for designing a server-side architecture.

Use case: I want to build an HTTP server for an app that allows users to upload and download multimedia content (images, videos, etc.). A large number of concurrent users (say, around 50k) are expected to upload and download content.

All the content will be stored in an AWS S3 bucket. Information about the S3 bucket, i.e. the bucket name and authentication headers, should be masked from the user. Since there are multiple Access Control Options (AWS-ACL) for an S3 bucket, I would prefer not to make the bucket available to All_Users (authenticated and anonymous users). I do not want to expose the content publicly.

Queries

  • Since I want to mask AWS S3 from the users, I will need to use a web server or reverse proxy. I have gone through multiple resources that compare Apache and Nginx. Since the server needs to deliver static content from S3 to a high number of concurrent users, Nginx seems to be the better option. Is that right?

  • Does setting the Access Control Level of the S3 bucket to ALL_USERS (authenticated and anonymous users) compromise data privacy? If I use a reverse proxy, there is no way for the user to determine the S3 bucket URLs. Is the data safe and private?

  • However, if the S3 bucket is made available to authenticated users only, will an Nginx reverse proxy work? I have gone through Nginx Reverse Proxy for S3. For Nginx to work as a reverse proxy, a pre-signed URL needs to be prepared. The expiry time of the pre-signed URL is again a tricky decision. Does setting a very long expiry time for a pre-signed URL make sense? Does it compromise the security or privacy of the data (similar to setting S3 access control to ALL_USERS)? If yes, is there a way to reverse proxy the request to a dynamically generated pre-signed URL (with a short expiry time) using Nginx only?

Any information and resources to consolidate my understanding would be really helpful.

Does setting the Access Control Level of the S3 bucket to ALL_USERS (authenticated and anonymous users) compromise data privacy?

Absolutely. Don't do it.

If I use a reverse proxy, there is no way for the user to determine the S3 bucket URLs. Is the data safe and private?

Theoretically, they can't determine it, but what if an error message or misconfiguration leaks the information? This is security through obscurity, which gives you nothing more than a false sense of security. There's always a better way.

Information about the S3 bucket, i.e. the bucket name and authentication headers, should be masked from the user.

The authentication mechanism of S3, with signed URLs, is designed so that there is no harm in exposing it to the user. The only secret is your AWS Secret Key, which you'll note is not exposed in a signed URL. It also can't reasonably be reverse-engineered, and a signed URL is good for only the resource and action that the signature permits.

Signing URLs and presenting them to the user does not pose a security risk, although, admittedly, there are other reasons why you might not want to do that. I do that routinely, in one of two ways: signing a URL while a page is being rendered, with a relatively long expiration time, or signing a URL and redirecting the user to it when they click on a link back to my application server. In the second case, the application server validates their authorization to access the resource and then returns a signed URL with a very short expiration time, such as 5 to 10 seconds. The expiration can occur while a download is in progress without causing a problem; the signature only needs to avoid expiring before the request to S3 is accepted.
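The short-expiry signing step described above can be sketched as follows. This is only an illustration using Python's standard library, assuming Signature Version 4 query-string signing and a virtual-hosted-style bucket URL; the function name `presign_get_url`, the bucket, the key, and the credentials are hypothetical placeholders, and in practice an AWS SDK would normally generate the URL for you.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step of the SigV4 key derivation chain."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_get_url(bucket, key, access_key, secret_key,
                    region="us-east-1", expires=10, now=None):
    """Return a SigV4 query-string-signed GET URL for a single S3 object.

    `expires` is the validity window in seconds; `now` (assumed to be UTC)
    can be passed in to make the result reproducible.
    """
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    host = f"{bucket}.s3.{region}.amazonaws.com"
    scope = f"{date_stamp}/{region}/s3/aws4_request"

    # Percent-encode the object key but keep "/" as the path separator.
    canonical_uri = "/" + quote(key, safe="/")
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    canonical_query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}"
        for k, v in sorted(params.items())
    )
    canonical_request = "\n".join([
        "GET",
        canonical_uri,
        canonical_query,
        f"host:{host}\n",      # canonical headers, each terminated by a newline
        "host",                # signed headers
        "UNSIGNED-PAYLOAD",    # presigned URLs do not sign the body
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key; the secret key itself never appears in the URL.
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, "s3")
    k_signing = _hmac_sha256(k_service, "aws4_request")
    signature = hmac.new(k_signing, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()

    return (f"https://{host}{canonical_uri}?{canonical_query}"
            f"&X-Amz-Signature={signature}")
```

An application server would call something like `presign_get_url("example-bucket", "videos/clip.mp4", access_key, secret_key, expires=10)` only after checking that the user is authorized, and then answer with an HTTP 302 whose `Location` header is the returned URL.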

However, if you want to go the proxy route (which, in addition to the above, is something I do in my systems as well), there's a much easier way than what you're envisioning: the bucket policy can be configured to grant specific permissions based on the source IP addresses of your servers.

Here's a (sanitized) policy taken directly from one of my buckets. The IP addresses are from RFC 5737, to avoid the confusion that private IP addresses would cause in this example.

These IP addresses are public IP addresses. They would be the Elastic IP addresses attached to your web servers or, preferably, to the NAT instances that the web servers use for their outgoing requests.

{
    "Version": "2008-10-17",
    "Id": "Policy123456789101112",
    "Statement": [
        {
            "Sid": "Stmt123456789101112",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": [
                        "203.0.113.173/32",
                        "203.0.113.102/32",
                        "203.0.113.52/32",
                        "203.0.113.19/32"
                    ]
                }
            }
        }
    ]
}

What does this do? If a request arrives at S3 from one of the listed IP addresses, the GetObject permission is granted to the requester. With a proxy, your proxy's IP address will be the IP address seen by S3, and the request will be granted if it matches the bucket policy. This allows your proxies to fetch objects from S3 while not allowing the rest of the Internet to, unless alternate credentials are presented, such as with a signed URL. This policy doesn't "deny" anything directly, because the deny is implicit. Importantly, don't upload your objects with the public-read ACL, because that would allow the object to be downloaded by anyone. The default private ACL works perfectly for this application.
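To make the allow-by-source-IP and implicit-deny behaviour concrete, here is a deliberately simplified model of it in Python. It is not how S3 evaluates policies internally: it only understands `Allow` statements that carry an `IpAddress` / `aws:SourceIp` condition, and the function name `ip_condition_allows` is a hypothetical helper introduced for illustration.

```python
import ipaddress

# Trimmed copy of the bucket policy shown above.
POLICY = {
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": [
                        "203.0.113.173/32",
                        "203.0.113.102/32",
                        "203.0.113.52/32",
                        "203.0.113.19/32",
                    ]
                }
            },
        }
    ]
}


def ip_condition_allows(policy, source_ip, action="s3:GetObject"):
    """Return True if an Allow statement's aws:SourceIp list matches source_ip.

    Simplification: statements without an IpAddress condition are ignored,
    and no other condition keys or explicit Deny statements are modelled.
    """
    addr = ipaddress.ip_address(source_ip)
    for stmt in policy["Statement"]:
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if stmt.get("Effect") != "Allow" or action not in actions:
            continue
        cidrs = stmt.get("Condition", {}).get("IpAddress", {}).get("aws:SourceIp", [])
        if isinstance(cidrs, str):
            cidrs = [cidrs]
        if any(addr in ipaddress.ip_network(cidr) for cidr in cidrs):
            return True
    # Nothing matched: the request is denied implicitly, without any Deny statement.
    return False
```

With this model, a request from one of the proxy addresses (for example `203.0.113.173`) is allowed, while a request from any other address falls through to the implicit deny.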

S3 can grant permissions like this based on other criteria, such as the Referer: header, and you may find examples of that online, but don't do that. Trusting what the browser reports as the referring page is an extremely weak and primitive security mechanism that provides virtually no real protection, because headers are incredibly simple to spoof. That sort of filtering is really only good for annoying lazy people who are hot-linking to your content. The source IP address is a different matter altogether, as it's not carried in a layer 7 header and cannot be readily spoofed.
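As a small illustration of how little effort spoofing a header takes, the sketch below builds an HTTP request with an arbitrary Referer value using only Python's standard library. The URL and referring page are hypothetical placeholders, and the request is only constructed here, not sent.

```python
import urllib.request


def build_request_with_referer(url, referer):
    """Build a GET request that claims to come from any page the caller likes."""
    req = urllib.request.Request(url)
    # The client chooses this value freely; the server cannot verify it.
    req.add_header("Referer", referer)
    return req


req = build_request_with_referer(
    "https://example-bucket.s3.amazonaws.com/photo.jpg",  # hypothetical object URL
    "https://trusted.example/gallery",                    # hypothetical "allowed" page
)
```

Passing `req` to `urllib.request.urlopen` would send the request with the forged header, which is why a Referer-based condition only deters casual hot-linking.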

Because S3 only interacts with the Internet via the TCP protocol, your source addresses cannot be spoofed in any practical way, even if it were known how you had enabled the bucket to trust these addresses. To do so would mean breaching the security of AWS's core IP network infrastructure. TCP requires the originating machine to be reachable across subnet boundaries at the source IP address it uses, and the AWS network would only ever route those responses back to your legitimately-allocated IP address, which would have no option other than to reset or discard the connections, since they were not initiated by you.

Note that this solution does not work in conjunction with the S3 VPC endpoints that Amazon recently announced, because with S3 VPC endpoints, your source IP address (as seen by S3) will be the private address, which isn't unique to your VPC. That should not be a problem, though; I mention this caveat only in the interest of thoroughness. S3 VPC endpoints are not required and not enabled by default and, if enabled, can be provisioned on a per-subnet basis.
