
Does submitting sitemap.xml once change the way the site's new pages will be crawled/discovered in the future?

My static site doesn't generate sitemap.xml automatically.

Hundreds of pages have been modified recently (meta description, etc.). I would like Google to recrawl the whole website. Submitting the URLs one by one in Search Console is not an option.

So I'm about to create a sitemap.xml and submit it in Google Search Console.

Question: If I submit a sitemap.xml only once, will GoogleBot still continue to crawl the website on its own in the long term and discover new pages on its own, or not?

Or will submitting a sitemap.xml URL once tell GoogleBot something like "No need to crawl automatically in the future, just use this sitemap.xml", which would eventually force me to maintain an up-to-date sitemap.xml? (I did not plan to do this; I wanted to create a one-shot sitemap.xml and then let GoogleBot find future pages on its own.)

Google will crawl any accessible page, with or without a sitemap. If your pages have changed, Google will crawl them again even if you don't submit a sitemap.

If you do upload a sitemap, it will not restrict Google to only the links it lists.

Submitting an XML sitemap may or may not get Googlebot to come re-crawl your entire site more quickly than it otherwise would. Google has suggested using temporary sitemaps to trigger recrawls, so it may be worth a try.

One way to speed this up could be to submit a temporary sitemap file listing these URLs with the last modification date (eg, when you changed them to 404 or added a noindex), so that we know to recrawl & reprocess them.

Googlebot will probably re-crawl most of your pages within a couple of weeks anyway. Submitting a temporary sitemap may speed up the process somewhat, but Googlebot may just say "I already know about these URLs" and not recrawl them just because they are in a sitemap for the first time.
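For a static site with no built-in sitemap generation, a temporary sitemap with last modification dates could be produced by a short script. The following is only a minimal sketch, assuming the built site is a directory of `.html` files and that each file's modification time is a reasonable stand-in for when the page last changed. The function name `build_sitemap` and its parameters are made up for illustration; only the `urlset`/`url`/`loc`/`lastmod` elements and the namespace come from the sitemaps.org protocol.

```python
from datetime import datetime, timezone
from pathlib import Path
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(site_root: Path, base_url: str) -> bytes:
    """Return sitemap XML listing every .html file under site_root.

    Hypothetical helper: assumes URLs mirror the file layout on disk.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    base = base_url.rstrip("/")
    for page in sorted(site_root.rglob("*.html")):
        rel = page.relative_to(site_root).as_posix()
        # Assumption: file mtime approximates the page's last modification.
        mtime = datetime.fromtimestamp(page.stat().st_mtime, tz=timezone.utc)
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base}/{rel}"
        ET.SubElement(url, "lastmod").text = mtime.strftime("%Y-%m-%d")
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

Writing the result of something like `build_sitemap(Path("public"), "https://example.com")` to `public/sitemap.xml` would give a file that can be submitted in Search Console. Note that this sketch lists `index.html` files under their literal path, so a site that serves them as directory URLs would need to adjust the `loc` values.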

If you are going to submit an XML sitemap, you should either keep it up to date going forward, or remove it after it serves its purpose. XML sitemaps don't give much control over which pages get indexed or how well they get ranked. At best they get Googlebot to come crawl all your URLs, provide a signal to Google about which of your URLs are canonical, and give you extra stats in Google Search Console. See "The Sitemap Paradox". If you don't keep the sitemap up to date, it may confuse Google regarding your preferred URLs, and the stats in Google Search Console won't be useful.

Having said that, having an old outdated sitemap won't prevent new pages on your site from getting crawled and indexed. Google doesn't limit crawling to just the sitemap, nor does Google index only the pages in the sitemap. When you have an XML sitemap, but Google indexes a page that is not included in it, Google will give you a warning in Google Search Console saying "indexed but not submitted in sitemap." See "Google says an indexed page is not in the sitemap even though it is in the sitemap" for an example. Avoiding these warnings is another reason not to leave an outdated sitemap in place.
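If you do leave a sitemap in place, one way to notice that it has gone stale is to compare it against the pages that actually exist in the built site. This is a rough sketch under the same kind of assumptions as any local check: the site is a directory of `.html` files whose paths map directly onto URLs, and the function name `pages_missing_from_sitemap` is hypothetical. It says nothing about what Google has indexed; it only shows which local pages the sitemap does not list.

```python
from pathlib import Path
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"s": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def pages_missing_from_sitemap(
    site_root: Path, sitemap_path: Path, base_url: str
) -> list[str]:
    """Return URLs of local .html pages that the sitemap does not list."""
    root = ET.parse(sitemap_path).getroot()
    listed = {
        loc.text.strip() for loc in root.findall("s:url/s:loc", NS) if loc.text
    }
    base = base_url.rstrip("/")
    # Assumption: each file's relative path is also its URL path.
    on_disk = {
        f"{base}/{page.relative_to(site_root).as_posix()}"
        for page in site_root.rglob("*.html")
    }
    return sorted(on_disk - listed)
```

A non-empty result suggests the sitemap should either be regenerated or taken down.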

TLDR: I would recommend one of the following actions:

  • Don't submit the sitemap as planned and just let Googlebot re-crawl your site on its own time.
  • Submit the sitemap but treat it as temporary and take it back down after a couple of weeks.
  • Submit the sitemap and keep it updated by regenerating it periodically going forward.

