Ruby on Rails sitemap using Amazon ECS (Fargate)

I have scoured the interwebs for months trying to find a solution, so any guidance will be a huge help to me.

So my task is that I have a RoR app that is using Fargate. I have a sitemap index and three sitemaps (links split up in 50k increments). These sitemaps need to be accessible via my URL (mysite.com/sitemap...).

So from my understanding, containers are ephemeral and adding the sitemap to my public folder will have undesirable results with indexing on Google.

I have found countless tutorials on how to upload the sitemap using Heroku via S3 - but this option appears to use the public URL of the S3 bucket and not the URL from my domain.

My guess is I need to use something like Elastic File System (EFS) or maybe even S3 - but I am lost. I can even put it this way: how do companies like Airbnb and GitHub store their sitemaps?

I don't know about Airbnb's or GitHub's sitemaps, but if you can get your app running on Fargate then you can figure out anything.

> So from my understanding, containers are ephemeral and adding the sitemap to my public folder will have undesirable results with indexing on Google.

It's true that containers are ephemeral, but that has nothing to do with undesirable results with Google.

You can host the sitemaps on S3 or Elastic File System (EFS). You can configure S3 to use your domain as well (see below), but I'm not sure if that is worth the effort.

The easiest thing to do is to host the sitemaps in your public folder. The process would be to generate the files on your dev machine and add them to the repo. When they are deployed, they will be in the public folder of each container and available to the Rails app.
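As a rough illustration of the "generate the files and commit them" step, here is a minimal sketch in plain Ruby (no gems). The method name `write_sitemaps`, the file names, and the host are assumptions made up for this example; the 50,000 limit mirrors the split described in the question. In a real app you would probably use a sitemap gem instead.

```ruby
require "fileutils"

# Assumption: 50,000 URLs per file, matching the split in the question.
MAX_URLS_PER_FILE = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical helper: writes sitemap1.xml, sitemap2.xml, ... and an index
# file (sitemap.xml) into dir. Returns the names of the files written.
def write_sitemaps(urls, dir:, host:, per_file: MAX_URLS_PER_FILE)
  FileUtils.mkdir_p(dir)
  names = []

  urls.each_slice(per_file).with_index(1) do |slice, i|
    name = "sitemap#{i}.xml"
    # encode(xml: :text) escapes &, < and > so the URLs are valid XML text
    entries = slice.map { |u| "  <url><loc>#{u.encode(xml: :text)}</loc></url>" }
    File.write(File.join(dir, name), <<~XML)
      <?xml version="1.0" encoding="UTF-8"?>
      <urlset xmlns="#{SITEMAP_NS}">
      #{entries.join("\n")}
      </urlset>
    XML
    names << name
  end

  # The index lists every sitemap file under the site's own host
  sitemaps = names.map { |n| "  <sitemap><loc>#{host}/#{n}</loc></sitemap>" }
  File.write(File.join(dir, "sitemap.xml"), <<~XML)
    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="#{SITEMAP_NS}">
    #{sitemaps.join("\n")}
    </sitemapindex>
  XML

  names + ["sitemap.xml"]
end
```

You could call it from a script or rake task with something like `write_sitemaps(urls, dir: "public", host: "https://mysite.com")` and commit the resulting files before deploying.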

If you decide that you don't want the Rails app to serve the sitemaps (which may make sense for certain use cases), then the next easiest thing would probably be to host them on S3.

You can configure S3 to use a subdomain. I'm not sure if this would have an effect on how Google sees your site, or if the sitemap index is supposed to be hosted on the same domain.

If you want to host the sitemaps on S3 with your own domain, then you might be able to use CloudFront to forward all requests to your Rails app, with the exception of the sitemaps. The sitemaps could be served from S3.

Reference: Using S3 with Subdomain

EDIT: If you decide to use CloudFront, then it's not necessary to use S3. CloudFront can cache the sitemap for days or weeks, and your app would only serve it once in that time.
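How long a CDN keeps the file depends in part on the caching headers your app sends, within whatever limits the CDN itself is configured with. The following is a minimal Rack-style sketch (no gems) of an endpoint that returns a sitemap with a `Cache-Control` header. The constant names, the placeholder XML, and the one-day lifetime are assumptions for illustration only.

```ruby
# Placeholder body; a real app would read or render the actual sitemap.
SITEMAP_XML = '<?xml version="1.0" encoding="UTF-8"?><sitemapindex/>'.freeze

# Assumption: let caches keep the sitemap for one day (in seconds).
ONE_DAY = 86_400

# Rack-style callable: takes the request env, returns [status, headers, body].
SitemapApp = lambda do |env|
  if env["PATH_INFO"] == "/sitemap.xml"
    headers = {
      "content-type"  => "application/xml",
      # "public" allows shared caches such as a CDN to store the response
      "cache-control" => "public, max-age=#{ONE_DAY}"
    }
    [200, headers, [SITEMAP_XML]]
  else
    [404, { "content-type" => "text/plain" }, ["Not found"]]
  end
end
```

With a header like this, requests for the sitemap that arrive while the cached copy is still fresh can be answered by the CDN without reaching the container.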

> My guess is I need to use something like Elastic File System (EFS) or maybe even S3 - but I am lost. I can even put it this way: how do companies like Airbnb and GitHub store their sitemaps?

Big companies like that would certainly have a CDN in front of their website. You can also have a CDN in front of your website. The AWS solution is CloudFront, but I would also recommend looking into Cloudflare.

In either case, once you have a CDN in front of your website, you can configure it to serve different content from different origins, based on the URL path. So for instance you could set up the default origin as your Ruby app, and set up the /sitemap path to use an S3 bucket origin that has your sitemap files in it.
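To make the routing idea concrete, here is a toy model in Ruby of how a CDN picks an origin from the URL path. This is not CloudFront or Cloudflare configuration, only an illustration of the decision; the host names and the `origin_for` helper are hypothetical.

```ruby
# Ordered list of [path prefix, origin]. Both host names are made up.
BEHAVIORS = [
  ["/sitemap", "my-sitemaps-bucket.s3.amazonaws.com"]
].freeze

# Everything that matches no prefix goes to the Rails app.
DEFAULT_ORIGIN = "rails-app.example.internal"

# Returns the origin that should answer a request for the given path.
def origin_for(path)
  match = BEHAVIORS.find { |prefix, _origin| path.start_with?(prefix) }
  match ? match.last : DEFAULT_ORIGIN
end
```

Here `origin_for("/sitemap1.xml")` selects the bucket, while `origin_for("/users/1")` falls through to the app. In a real CDN the same rule is expressed as a path pattern attached to an origin.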


Alternatively you could store the sitemap in EFS, mount the EFS volume in your Fargate tasks, and configure your Ruby app (or Nginx running in front of your Ruby app?) to serve the file from the sitemap volume when a request comes in for /sitemap.
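If the app itself serves the files from the mounted volume, the handler could look roughly like this Rack-style sketch (no gems). The mount path `/mnt/sitemaps`, the `SITEMAP_DIR` environment variable, and the `sitemap_app` helper are assumptions for illustration; the actual mount point is whatever you set in the task definition.

```ruby
# Hypothetical mount point of the EFS volume inside the container.
SITEMAP_DIR = ENV.fetch("SITEMAP_DIR", "/mnt/sitemaps")

# Only plain names such as /sitemap.xml or /sitemap2.xml are accepted,
# so a request cannot reach files outside the sitemap directory.
SITEMAP_PATH = %r{\A/sitemap[\w-]*\.xml\z}

# Builds a Rack-style callable that serves sitemap files from dir.
def sitemap_app(dir = SITEMAP_DIR)
  not_found = [404, { "content-type" => "text/plain" }, ["Not found"]]

  lambda do |env|
    path = env["PATH_INFO"].to_s
    file = File.join(dir, File.basename(path))

    if path.match?(SITEMAP_PATH) && File.file?(file)
      [200, { "content-type" => "application/xml" }, [File.read(file)]]
    else
      not_found
    end
  end
end
```

Because every task mounts the same volume, regenerating the files on EFS updates what all containers serve without a redeploy.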
