How can we write an .htaccess rule to block the Googlebot UA from accessing URLs ending in a forward slash followed by 4-6 digits?
We're wasting a lot of our Googlebot crawl budget because it's crawling "no-index" pages.
The plan is to use .htaccess to block the UA from URLs ending with a forward slash, followed by 4-6 digits.
Ex:
https://example.com/folder/folder/12563
https://example.com/folder/folder/125637
https://example.com/folder/folder/1563
I think the REGEX looks something like this:
\/\d{4,6}$
But how do I configure .htaccess to do this, and only for a specific UA (Googlebot)?
Thanks!
You can use this:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} googlebot [NC]
RewriteRule /\d{4,6}$ - [F,L]
This returns an HTTP 403 Forbidden response to Googlebot whenever it requests one of the restricted URLs on your server.
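For reference, here is a commented variant of the same rule. It is only a sketch, and it assumes mod_rewrite is enabled and that the .htaccess file sits in the document root. The one change is the (^|/) prefix, which is my addition and not part of the rule above: in .htaccess context the pattern is matched against the path without its leading slash, so a top-level URL such as /12563 would not match a pattern that starts with a literal slash.

```apache
# Sketch only: assumes mod_rewrite is enabled and this file is in the document root.
RewriteEngine On

# Apply the rule only when the User-Agent header contains "googlebot" (case-insensitive).
RewriteCond %{HTTP_USER_AGENT} googlebot [NC]

# Match paths whose last segment is 4 to 6 digits.
# (^|/) lets the digits sit either at the top level or after any folder.
# [F] sends 403 Forbidden and implies [L], so no extra flag is needed.
RewriteRule (^|/)\d{4,6}$ - [F]
```

If all of the affected URLs sit below at least one folder, as in the examples from the question, the simpler /\d{4,6}$ pattern behaves the same way.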