
Does submitting sitemap.xml once change the way the site's new pages will be crawled/discovered in the future?

My static site doesn't generate sitemap.xml automatically.

Hundreds of pages have been modified recently (meta description, etc.). I would like Google to recrawl the whole website. Submitting the URLs one by one in Search Console is not an option.

So I'm about to create a sitemap.xml and submit it in Google Search Console.

Question: If I submit a sitemap.xml only once, will Googlebot still continue to crawl the website on its own in the long term and discover new pages by itself, or not?

Or does submitting a sitemap.xml URL once send a message to Googlebot like "No need to crawl automatically in the future, just use this sitemap.xml"? Would that eventually force me to maintain an up-to-date sitemap.xml? (I did not plan to do that; I wanted to submit a one-shot sitemap.xml and then let Googlebot find future new pages on its own.)

Google will crawl any page that is accessible, with or without a sitemap. If your pages have changed, Google will crawl them again even if you don't submit a sitemap.

If you do upload a sitemap, it will not restrict Google to crawling only the links it contains.

Submitting an XML sitemap may or may not get Googlebot to come re-crawl your entire site more quickly than it otherwise would. Google has suggested using temporary sitemaps to trigger recrawls, so it may be worth a try:

"One way to speed this up could be to submit a temporary sitemap file listing these URLs with the last modification date (eg, when you changed them to 404 or added a noindex), so that we know to recrawl & reprocess them."
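
Such a temporary sitemap could look like the sketch below. The URLs and dates here are placeholders; list your actual modified pages with the dates they last changed:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- placeholder URLs and dates; replace with your modified pages -->
      <url>
        <loc>https://example.com/some-changed-page/</loc>
        <lastmod>2024-05-01</lastmod>
      </url>
      <url>
        <loc>https://example.com/another-changed-page/</loc>
        <lastmod>2024-05-01</lastmod>
      </url>
    </urlset>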

Googlebot will probably re-crawl most of your pages within a couple of weeks anyway. Submitting a temporary sitemap may speed up the process some, but Googlebot may just say "I already know about these URLs" and not recrawl just because they are in a sitemap for the first time.

If you are going to submit an XML sitemap, you should either keep it up to date going forward, or remove it after it serves its purpose. XML sitemaps don't give much control over which pages get indexed or how well they get ranked. At best they get Googlebot to come crawl all your URLs, provide a signal to Google about which of your URLs are canonical, and give you extra stats in Google Search Console. See The Sitemap Paradox. If you don't keep the sitemap up-to-date, it may confuse Google regarding your preferred URLs, and stats in Google Search Console won't be useful.

That said, having an old, outdated sitemap won't prevent new pages on your site from getting crawled and indexed. Google doesn't limit crawling to just the sitemap, nor does Google index only the pages in the sitemap. When you have an XML sitemap but Google indexes a page that is not included in it, Google will give you a warning in Google Search Console saying "Indexed, not submitted in sitemap." See Google says an indexed page is not in the sitemap even though it is in the sitemap for an example. Avoiding these warnings is another reason not to leave an outdated sitemap in place.

TLDR: I would recommend one of the following actions:

  • Don't submit the sitemap as planned and just let Googlebot re-crawl your site on its own time.
  • Submit the sitemap but treat it as temporary and take it back down after a couple weeks.
  • Submit the sitemap and keep it updated by regenerating it periodically going forward (a sketch of such a script follows this list).
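
For the last option, a small script run after each build can regenerate the file. Below is a minimal sketch in Python; it assumes the built site lives in a local public/ directory and is served from https://example.com/ (both placeholders), and it uses each file's modification time as the lastmod date:

    from datetime import datetime, timezone
    from pathlib import Path
    from xml.sax.saxutils import escape

    SITE_ROOT = Path("public")        # placeholder: your static build output directory
    BASE_URL = "https://example.com"  # placeholder: your site's base URL

    entries = []
    for html_file in sorted(SITE_ROOT.rglob("*.html")):
        # Map public/foo/index.html -> /foo/ and public/bar.html -> /bar.html
        rel = html_file.relative_to(SITE_ROOT).as_posix()
        path = "/" + (rel[:-len("index.html")] if rel.endswith("index.html") else rel)
        # Use the file's modification time as the <lastmod> date
        lastmod = datetime.fromtimestamp(html_file.stat().st_mtime, tz=timezone.utc)
        entries.append(
            "  <url>\n"
            f"    <loc>{escape(BASE_URL + path)}</loc>\n"
            f"    <lastmod>{lastmod.strftime('%Y-%m-%d')}</lastmod>\n"
            "  </url>"
        )

    sitemap = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>\n"
    )
    (SITE_ROOT / "sitemap.xml").write_text(sitemap, encoding="utf-8")
    print(f"Wrote {len(entries)} URLs to {SITE_ROOT / 'sitemap.xml'}")

Run it as part of your build (or from a cron job) so the sitemap always reflects the current set of pages.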

