
Setting up an AngularJS app for crawling

Original post 2013-08-23 00:49:20 6 1 angularjs / google-search

I have an Angular app set up to work in HTML5 mode with a #! fallback, so on most browsers it works with http://example.com/foo/bar and on less cool browsers we get http://example.com/#!/foo/bar . All that seems to work.

I have been trying to get Google to crawl the site nicely, and it doesn't seem to be working as expected. I have set up <meta property="fragment" content="!" /> in the page to tell Google to recrawl with ?_escaped_fragment_= , and set up nginx to redirect to a static version of the page when it receives a request like this.

It is working for the front page: looking in the access logs I can see http://example.com/?_escaped_fragment_= and I can google "A sentence from the front page" and get the home page back as a result.

However, it is not working for any of the interior pages. If I look in the access logs I can see a whole bunch of http://example.com/foo/bar/?_escaped_fragment_= rather than http://example.com/?_escaped_fragment_=/foo/bar/ as I might have expected.

Is there anything obvious I am missing to make Google do what I want it to?

1 answer

I think that is how it works for AngularJS apps with HTML5 routes, and indeed, you should see requests with just ?_escaped_fragment_=, not ?_escaped_fragment_=/foo/bar/. For more info, check section "3. Handle pages without hash fragments" here: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started .
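
To make the two request shapes concrete, here is a rough Python sketch of the mapping described above. The function names crawler_url and snapshot_route are hypothetical, the escaping of the fragment is simplified, and a real setup would express the server-side part in the nginx config rather than in Python.

```python
from typing import Optional
from urllib.parse import parse_qsl, quote, urlsplit, urlunsplit


def crawler_url(pretty_url: str) -> str:
    """Return the URL a crawler following the AJAX crawling scheme would request.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(pretty_url)
    if parts.fragment.startswith("!"):
        # Hash-bang URL: everything after "#!" becomes the parameter value.
        # Simplified escaping (assumption): percent-encode all but "/".
        value = quote(parts.fragment[1:], safe="/")
    else:
        # HTML5-mode URL that opted in via the meta tag: the path stays
        # where it is and the parameter is appended with an empty value.
        value = ""
    param = "_escaped_fragment_=" + value
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def snapshot_route(request_url: str) -> Optional[str]:
    """Return the app route to serve a static snapshot for, or None.

    Hypothetical helper: handles both request shapes, so the server does
    not need to care which kind of URL the crawler started from.
    """
    parts = urlsplit(request_url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    if "_escaped_fragment_" not in params:
        return None  # ordinary request, no snapshot needed
    fragment = params["_escaped_fragment_"]
    # Empty value: the route is the request path (HTML5-mode pages).
    # Non-empty value: the route is the fragment (hash-bang pages).
    return fragment if fragment else parts.path


print(crawler_url("http://example.com/foo/bar/"))
print(crawler_url("http://example.com/#!/foo/bar/"))
print(snapshot_route("http://example.com/foo/bar/?_escaped_fragment_="))
```

In other words, a snapshot rule that only looks at the value of _escaped_fragment_ will match the front page but miss interior HTML5-mode pages, because for those the route arrives in the path and the parameter value is empty.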