
Setting up an AngularJS app for crawling

I have an Angular app set up to work in HTML5 mode with a #! fallback, so on most browsers it works with http://example.com/foo/bar and on less cool browsers we get http://example.com/#!/foo/bar . All of that seems to work.

I have been trying to get Google to crawl the site nicely, and it doesn't seem to be working as expected. I have put <meta property="fragment" content="!" /> in the page to tell Google to recrawl with ?_escaped_fragment_= , and set up nginx to redirect to a static version of the page when it receives a request like this.

It is working for the front page: looking in the access logs I can see http://example.com/?_escaped_fragment_= and I can google "A sentence from the front page" and get the home page back as a result.

However, it is not working for any of the interior pages. If I look in the access logs I can see a whole bunch of http://example.com/foo/bar/?_escaped_fragment_= rather than http://example.com/?_escaped_fragment_=/foo/bar/ as I might have expected.

Is there anything obvious I am missing to make Google do what I want it to?

I think that is expected for AngularJS apps with HTML5 routes, and indeed, you should see requests with just ?_escaped_fragment_= , not ?_escaped_fragment_=/foo/bar/ . For more info, check section "3. Handle pages without hash fragments" at https://developers.google.com/webmasters/ajax-crawling/docs/getting-started .
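To make the difference concrete, here is a minimal TypeScript sketch of how a crawler following the AJAX crawling scheme would rewrite the two kinds of URL, and how a server could recover the route in either case. The function names (escapeFragment, toCrawlerUrl, routeForCrawlerRequest) are made up for illustration, and the escaping is deliberately simplified to a handful of characters, so treat it as an assumption-laden sketch rather than a full implementation of the scheme.

```typescript
// Hypothetical helper: percent-escape a few characters that have special
// meaning in a query string. Simplified; the real scheme covers more ranges.
function escapeFragment(fragment: string): string {
  return fragment.replace(/[%#&+ ]/g, (ch: string) =>
    "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

// Hypothetical helper: the URL a crawler would request for a given page URL.
// - "#!" URLs: the part after "#!" moves into _escaped_fragment_.
// - HTML5 (pushState) URLs on a page carrying the fragment meta tag:
//   the path stays as it is and _escaped_fragment_ is left empty.
function toCrawlerUrl(url: string): string {
  const hashIndex = url.indexOf("#!");
  const base = hashIndex === -1 ? url : url.slice(0, hashIndex);
  const fragment = hashIndex === -1 ? "" : url.slice(hashIndex + 2);
  const separator = base.indexOf("?") === -1 ? "?" : "&";
  return base + separator + "_escaped_fragment_=" + escapeFragment(fragment);
}

// Hypothetical helper: which app route a crawler request refers to.
// A non-empty fragment value wins; otherwise the request path is the route.
function routeForCrawlerRequest(path: string, escapedFragment: string): string {
  return escapedFragment !== "" ? escapedFragment : path;
}

// HTML5-mode URL: the path is kept, the fragment parameter is empty.
console.log(toCrawlerUrl("http://example.com/foo/bar/"));
// Hashbang URL: the route moves into the fragment parameter.
console.log(toCrawlerUrl("http://example.com/#!/foo/bar/"));
```

With this in mind, the server-side rule that serves static snapshots needs to key on the request path whenever _escaped_fragment_ is present but empty, which is the case routeForCrawlerRequest("/foo/bar/", "") models.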
