
App Engine apps vs. Googlebot web crawler

I built an App Engine web app, cricket.hover.in. The app links to about 15k URLs, but even long after launch, none of its pages are indexed on Google.

Any base link placed on my root site hover.in is indexed within minutes. I put the link to this app on the root site's home page a long time ago, but it made no difference.

Can anyone analyse whether there is an issue with cricket.hover.in, or whether bots have issues with Google App Engine?

I tested the URL using the Labs tool in Google Webmaster Tools; there the response is fine and the HTML is clean.

But when I test the same URL (cricket.hover.in) with the following tools, they show various failure results:

www.dnsqueries.com/en/googlebot_simulator.php

www.smart-it-consulting.com/internet/google/googlebot-spoofer/

If I test some of my PHP or WordPress links with the tools above, the results are fine.
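One way to reproduce what the spoofer tools do is to request the page twice, once with a browser-like User-Agent and once with a Googlebot-like one, and compare the results. The following is only a minimal sketch under assumptions: the browser User-Agent string is a made-up placeholder, and the function names `build_request` and `fetch_summary` are hypothetical helpers, not part of any tool mentioned above. A real Googlebot request also comes from Google's own IP ranges, so a server that filters by IP could still behave differently.

```python
import urllib.error
import urllib.request

# The Googlebot string is the commonly published desktop form; the browser
# string is an arbitrary placeholder chosen for this sketch.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


def build_request(url, user_agent):
    """Return a GET request for url that announces the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_summary(url, user_agent, timeout=10):
    """Fetch url and return (status_code, body_length).

    HTTP error responses (4xx/5xx) are returned as a status code rather
    than raised; DNS or connection failures still raise URLError.
    """
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status, len(resp.read())
    except urllib.error.HTTPError as err:
        return err.code, len(err.read())
```

Calling `fetch_summary(url, BROWSER_UA)` and `fetch_summary(url, GOOGLEBOT_UA)` for the same URL and comparing the two tuples would show whether the server answers differently depending on the User-Agent header alone.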


Sorry, I made a mistake in the question; excuse me for the misleading information. The domain is cricket.trak.in/, and it is referred from the base URL trak.in. I made the mistake absent-mindedly, after a long investigation failed to turn up a solution. Please check with this domain.

I submitted a sitemap 3 days ago; it lists almost 22k URIs in total, but as of today the indexed count is still 0.

Secondly, cricket.trak.in itself doesn't return 15k URIs; what I mean is that the site as a whole, if crawled, would return about 15k URIs.

Well, from this corner of cyberspace, there is no such domain as cricket.hover.in:

$ dig cricket.hover.in.
; <<>> DiG 9.6.1-P2 <<>> cricket.hover.in.
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 30665
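
The same check can be scripted if `dig` is not at hand. This is a small sketch using the standard library resolver; the helper name `resolves` is made up for illustration, and it only reports what the local resolver sees, which may differ from what Googlebot's resolvers see.

```python
import socket


def resolves(hostname):
    """Return True if the local resolver maps hostname to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, 80)) > 0
    except socket.gaierror:
        # Raised for NXDOMAIN and other name-resolution failures.
        return False
```

Running `resolves("cricket.hover.in")` and `resolves("cricket.trak.in")` from a few different networks would help tell a missing DNS record apart from a crawler-specific problem.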

I'd also guess that a URL that returns 15k hrefs is considered utterly useless spam by many spiders, which will ignore it even if they can reach it.

Does your site have proper sitemaps, and have you pushed them to Google and other search engines? I can't check because http://cricket.hover.in gives me a 404, so it could be a DNS problem. What happens when you point your browser to that URL?
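
For reference, a sitemap in the sitemaps.org format is a small XML file, and the protocol caps each file at 50,000 URLs. The sketch below shows one way such files could be generated; `build_sitemaps` is a hypothetical helper, and how the list of URLs is obtained from the app is left out because the thread does not say.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file limit in the sitemaps.org protocol
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


def build_sitemaps(urls):
    """Return a list of sitemap XML strings, one per chunk of at most 50,000 URLs."""
    documents = []
    for start in range(0, len(urls), MAX_URLS_PER_FILE):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + MAX_URLS_PER_FILE]:
            # Each entry is <url><loc>...</loc></url>; ElementTree escapes "&" etc.
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        documents.append(XML_HEADER + ET.tostring(urlset, encoding="unicode"))
    return documents
```

With roughly 22k URIs, as mentioned in the question, this would produce a single file; only sites past the 50,000-URL limit need several files plus a sitemap index.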
