
Disable crawling of subdomains by the Google crawler

I would like to know how I can disallow Google from crawling my subdomains.

I made a screenshot of my webspace folder. The awesom-media folder is the folder where the main site www.awesom-media.de is.

The other ones are subdomains. What I want is for Google not to crawl them, but I don't know how.

I don't have a robots.txt in the awesom-media folder, but as you can see there is one in the / part, and its content is:

User-agent: *
Disallow:

And that's it.

How can I tell Google not to crawl the subdomains?

In case all your subdomains route directly to specific folders (e.g. automagazin.awesom-media.de uses the folder auto-magazin), just place a robots.txt with

User-agent: *
Disallow: /

in each of the folders for the subdomains you want to disallow for Google. I guess these are auto-magazin and future-magazin (and maybe more).

Currently you put it into the root folder, which the crawler for a subdomain cannot see at all, since robots.txt is always fetched from the root of the host being crawled. Just try to load [subdomain].awesom-media.de/robots.txt and see whether it returns your robots.txt or not.
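Note the difference between the two files above: a bare `Disallow:` (your current root robots.txt) allows everything, while `Disallow: /` blocks everything. A quick sketch with Python's standard-library robots.txt parser shows the contrast; the subdomain URL is just a hypothetical example:

```python
from urllib.robotparser import RobotFileParser

# "Disallow:" with an empty path allows all crawling;
# "Disallow: /" blocks the entire host.
allow_all = RobotFileParser()
allow_all.parse(["User-agent: *", "Disallow:"])

block_all = RobotFileParser()
block_all.parse(["User-agent: *", "Disallow: /"])

url = "http://auto-magazin.awesom-media.de/some-page"  # hypothetical subdomain URL
print(allow_all.can_fetch("Googlebot", url))   # True  - the current root robots.txt
print(block_all.can_fetch("Googlebot", url))   # False - the rules suggested above
```

So only the folders that serve the subdomains should get the `Disallow: /` file; the main site's robots.txt can stay as it is.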
