
Can a robots.txt Disallow be bypassed with .htaccess?

I have a simple question. Let's say that I have this in robots.txt:

User-agent: *
Disallow: /

And something like this in .htaccess:

RewriteRule ^somepage/.*$ index.php?section=ubberpage&parameter=$0

And of course in index.php something like:

$imbaVar = $_GET['section'];
// Split or otherwise process the value to pick a specific page

include("pages/theImbaPage.html"); // Or a .php file, or whatever

Will robots be able to see what's in the HTML included by the script (site.com/somepage)? I mean, the URL points to an inaccessible place (/somepage is disallowed), but the request is still rewritten to a valid place (index.php).

No. With robots disallowed, they are not allowed to browse any page on your site, assuming they follow your rules.

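Assuming the robots respect robots.txt, they will not be able to see any page on the site at all, since you stated you used Disallow: /. A compliant robot checks the URL it is about to request against the rules before sending anything, so it never gets far enough for the rewrite to matter.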
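As a rough illustration of what a well-behaved crawler does before fetching, here is a minimal sketch using Python's standard urllib.robotparser. The helper name is_allowed and the user agent "ExampleBot" are made up for this example, and site.com is just the placeholder host from the question.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question: every path is off limits for every robot.
QUESTION_RULES = "User-agent: *\nDisallow: /\n"


def is_allowed(url, robots_txt=QUESTION_RULES, user_agent="ExampleBot"):
    """Return True if a compliant robot may fetch `url` under `robots_txt`.

    `is_allowed` and "ExampleBot" are hypothetical names for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The check uses the URL the robot wants to request, not whatever
    # the server would rewrite it to internally.
    print(is_allowed("http://site.com/somepage/"))                     # False
    print(is_allowed("http://site.com/index.php?section=ubberpage"))   # False
```

Both the public URL and the rewrite target are refused, because Disallow: / covers every path on the site.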

If the robots do not respect your robots.txt file, however, they will be able to see the content, because the rewrite is performed on the server side: the robot simply requests the URL and receives whatever index.php outputs.
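To make the "server side" point concrete, the sketch below imitates the RewriteRule from the question in Python. It is only a simplified model, not Apache's actual implementation: it assumes the rule lives in a .htaccess file at the site root (so the leading slash is stripped before matching) and that $0 stands for the whole matched string. The function name internal_target is made up for the example.

```python
import re
from urllib.parse import urlsplit

# Pattern and substitution taken from the RewriteRule in the question.
RULE = re.compile(r"^somepage/.*$")
SUBSTITUTION = "index.php?section=ubberpage&parameter={whole_match}"


def internal_target(url):
    """Return the internal target the rule would map `url` to, or None.

    Hypothetical helper: a simplified model of a per-directory rewrite.
    """
    path = urlsplit(url).path.lstrip("/")  # .htaccess rules see no leading slash
    match = RULE.match(path)
    if match is None:
        return None
    return SUBSTITUTION.format(whole_match=match.group(0))


if __name__ == "__main__":
    # The client only ever asks for the left-hand URL; the mapping to
    # index.php happens inside the server and is not visible to it.
    for url in ("http://site.com/somepage/foo", "http://site.com/other"):
        print(url, "->", internal_target(url))
```

Nothing in this mapping consults robots.txt, which is why a robot that ignores the file gets the page like any other visitor.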


 