
Creating a robots.txt for an ASP.NET MVC site

I'm creating a robots.txt file for my website, but looking through my project structure, I'm not sure what to disallow.

Do I need to disallow standard .NET MVC directories and files like /App_Data, /web.config, /Controllers, /Models, /Global.asax? Or will those not be indexed already?

What about directories like /bin and /obj?

If I want to disallow a page, do I disallow /Views/MyPage/Index.cshtml, or /MyPage?

Also, when specifying the sitemap in the robots.txt file, can I use my Web.sitemap, or does it need to be a different xml file?

robots.txt refers to paths as they are publicly seen by Web crawlers.

There's nothing particularly special about a crawler: it merely uses HTTP to request pages from your site precisely like a user does.

So, given that your MVC site is properly configured, files like /web.config or the paths you mention won't be visible to the outside world, since neither IIS nor your application is configured to serve them. Even if a crawler requested one of those paths directly, it would receive a 404 Not Found and move on.

Similarly, your .cshtml or .aspx content files won't be seen under those extensions. A Web crawler sees exactly the URLs you expose to users, so you disallow routes (e.g. /MyPage), not view files.
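Given that, a robots.txt needs to reference only public routes. A minimal sketch (the /MyPage route and the sitemap host are illustrative assumptions, not from your project):

```
# robots.txt — paths here are public URLs as crawlers see them,
# not server-side files or project directories.
User-agent: *

# Disallow the public route, not the view file (/Views/MyPage/Index.cshtml).
Disallow: /MyPage/

# The Sitemap directive expects an XML sitemap in the sitemaps.org format,
# served over HTTP; an ASP.NET Web.sitemap navigation file is a different
# format and won't be understood by crawlers.
Sitemap: https://www.example.com/sitemap.xml
```

There is no need to list /bin, /obj, /App_Data, or similar directories: they are never served, so crawlers cannot reach them in the first place.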
