
How to disallow part of a string in robots.txt for WordPress

I have the following setup in my WordPress robots.txt file. For some reason the Allow part of it isn't working. According to Google Webmaster Tools, it doesn't like the following.

Can anyone tell me why?

Disallow: /blog/author/*
Allow: /blog/author/admin

Thanks! :)

The trailing * is unnecessary. By robots.txt convention, a Disallow expression blocks any URL that starts with that expression. The original robots.txt specification didn't have wildcards at all, and even with wildcard support, /blog/author/ and /blog/author/* match the same set of URLs.

The original robots.txt specification says that bots should read the robots.txt file and apply the first matching rule. Although the original spec didn't include the Allow directive, early implementers of Allow kept that "first matching rule" behavior. If Googlebot follows it, it will see the Disallow line first and assume it can't crawl /blog/author/admin, because that URL matches the Disallow pattern.

I would suggest moving the Allow above the Disallow, and removing the asterisk from the Disallow expression.
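A minimal sketch of the corrected rules, assuming the goal from the question (block every author page except the admin author's):

Allow: /blog/author/admin
Disallow: /blog/author/

With first-match processing, /blog/author/admin now hits the Allow line first, while every other /blog/author/ URL falls through to the Disallow line.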

I think what you're trying to do in your WordPress robots.txt is the same thing you can see at webbingbcn.es/robots.txt, but allowing /wp-admin/ (a fuller sketch follows the list):

  • Allow: /wp-admin/
  • Disallow: /author/
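As a rough sketch, a complete file along those lines might look like this (the User-agent line is an assumption, and I haven't reproduced the linked file verbatim):

User-agent: *
Allow: /wp-admin/
Disallow: /author/

Note that these two prefixes don't overlap, so in this particular case the order of the Allow and Disallow lines doesn't matter.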
