
Disallow query strings in robots.txt for only one URL

So I have one URL, chickens.com/hatching, that has potential query strings it could be indexed with, e.g. chickens.com/hatching?type=fast. I definitely want to keep the base URL, chickens.com/hatching, indexed, but none of the query-parameter variants. I do want query parameters indexed on other pages, just not this one, so a catch-all for all pages will not work. Secondarily, I am rewriting URLs to remove trailing slashes; would this catch chickens.com/hatching/?type=fast as well as chickens.com/hatching?type=fast?

Does this work as a solution to my issue?

Disallow: /hatching?*

I have heard this only works for Google's crawlers... is there a more robust solution that works for all crawlers?

Thanks for any help! It is greatly appreciated.

User-agent: *
Disallow: /hatching?
Disallow: /hatching/

This robots.txt will block all URLs whose path starts with /hatching? or /hatching/, so for example:

  • /hatching?
  • /hatching?foo=bar
  • /hatching/
  • /hatching/foo
  • /hatching/?foo=bar

It's only using features from the original robots.txt specification, so all conforming bots should be able to understand this.
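The original spec's matching is plain prefix matching against the path (plus query string). A minimal sketch to sanity-check which of the example paths the two rules catch (`is_blocked` and the `DISALLOW` list are illustrative helpers, not part of any library):

```python
# Simple prefix matching, as in the original robots.txt specification:
# a URL is blocked if its path (plus query string) starts with any Disallow value.
DISALLOW = ["/hatching?", "/hatching/"]

def is_blocked(path_and_query: str) -> bool:
    return any(path_and_query.startswith(rule) for rule in DISALLOW)

print(is_blocked("/hatching"))             # False: base page stays crawlable
print(is_blocked("/hatching?type=fast"))   # True: caught by /hatching?
print(is_blocked("/hatching/?type=fast"))  # True: caught by /hatching/
print(is_blocked("/other?type=fast"))      # False: query strings elsewhere stay indexed
```

Note that neither rule is a prefix of the bare path /hatching, which is why the base URL itself remains crawlable.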
