
In a single-page app, what is the right way to deal with wrong URLs (404 errors)?

I am currently writing a web application using AngularJS, but I think this question applies to any JavaScript framework that does routing on the client side (as Angular does).

In a single-page app, what is the right way to deal with wrong URLs?

Looking at a few major sites, I see that Gmail will redirect to the inbox if you type any random URL below https://mail.google.com/mail/. This happens server-side (with an HTTP 30x redirect code) or client-side, depending on whether the wrong path is before or after the # character. On the other hand, Twitter shows a real HTTP 404 for any invalid URL. A third option would be to show a "soft" 404, a purely client-side error page.

These solutions seem appropriate for different situations. Twitter wants the links to Twitter users and tweets to be real links, so people can share them, post them in news articles, etc., so it is important that invalid links be recognized as such (if I have a broken link to a tweet on my website, a simple crawl will tell me that). In Gmail, on the other hand, you are not expected to share links into your inbox, and I'm not even sure if the links are really permanent/persistent: it seems the URL updating mostly serves the purpose of browser history navigation within the single-page app. The third approach of giving soft errors might be appropriate for situations similar to Gmail, but where there is no reasonable "default" page.

After this long introduction, here are some specific questions:

  • Is it ever acceptable to give a "soft" error page instead of a 404 error, or should a single-page app always redirect to a real 404 if a URL is invalid?
  • Gmail's code may be perfectly bug-free, but if it did have a bug leading to invalid links that end up redirecting back to the inbox, that might be even more confusing for users than an error page. For most web apps out there, which are not as well tested as Gmail, would it be better to show an error page?
  • To implement real 404s for single-page apps, it seems necessary to duplicate the routing logic on the server side. Is there any way around this?
  • When redirecting to a 404, I think the user should be able to see the URL that caused the error, possibly in the URL bar. With the HTML5 history API, I think this can be accomplished by simply triggering a reload of the current page (with the wrong URL), combined with the server-side routing mentioned above. For browsers that do not support this, or when using hashbang notation, this does not seem possible. What's the best way to support all browsers?
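On the question of duplicating routing logic: one possible way around it is to keep a single route table in a module that both the client-side router and the server's catch-all handler import. The sketch below is only an illustration under that assumption; the `routes` table, `matchRoute`, and `statusFor` are hypothetical names, and the pattern syntax (`:id` placeholders) is made up for the example rather than taken from any particular framework.

```typescript
// Hypothetical shared route table. The same module could be imported by the
// client-side router and by the server's catch-all handler.
type Route = { pattern: string; name: string };

const routes: Route[] = [
  { pattern: "/", name: "home" },
  { pattern: "/users/:id", name: "user" },
  { pattern: "/users/:id/posts/:postId", name: "post" },
];

// Split a path into non-empty segments, ignoring any query string.
function toSegments(path: string): string[] {
  const q = path.indexOf("?");
  const clean = q === -1 ? path : path.slice(0, q);
  return clean.split("/").filter((s) => s.length > 0);
}

// Returns the name of the first matching route, or null if nothing matches.
function matchRoute(path: string, table: Route[] = routes): string | null {
  const segments = toSegments(path);
  for (const route of table) {
    const parts = toSegments(route.pattern);
    if (parts.length !== segments.length) continue;
    // A ":name" part matches any segment; other parts must match exactly.
    const ok = parts.every((p, i) => p.charAt(0) === ":" || p === segments[i]);
    if (ok) return route.name;
  }
  return null;
}

// Server side: serve the app shell with 200 for known paths, 404 otherwise.
function statusFor(path: string): 200 | 404 {
  return matchRoute(path) === null ? 404 : 200;
}
```

This only tells the server whether a path has a valid shape. Whether `/users/42` refers to a user that actually exists still needs a data lookup on the server.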

If you care about SEO, one of the ways that angular.io was able to solve this problem (at least with Google, anyway) is by using a noindex meta tag "to indicate soft-404 status which will prevent crawlers from crawling the content of the page". Apparently it can be added to the document via JavaScript.
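As a rough sketch of adding the tag from JavaScript, the function below appends `<meta name="robots" content="noindex">` to the document head. `markAsSoft404` is a hypothetical helper name, and the small structural interfaces are there only so the example does not depend on DOM typings.

```typescript
// Minimal structural types so the sketch does not need DOM typings.
interface ElementLike {
  setAttribute(name: string, value: string): void;
}
interface DocumentLike<E extends ElementLike> {
  createElement(tagName: string): E;
  head: { appendChild(node: E): unknown };
}

// Hypothetical helper: mark the current page as a soft 404 for crawlers by
// adding <meta name="robots" content="noindex"> to the document head.
function markAsSoft404<E extends ElementLike>(doc: DocumentLike<E>): void {
  const meta = doc.createElement("meta");
  meta.setAttribute("name", "robots");
  meta.setAttribute("content", "noindex");
  doc.head.appendChild(meta);
}
```

In a browser, the client-side router's "not found" handler would presumably call this with the global `document` before rendering the error view.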

Alternatively, using JavaScript, you can redirect to a page that will respond with an actual HTTP 404 status code. Google understands JavaScript redirects just fine. Your original /does-not-exist page, when redirected to /404-error?from=does-not-exist, will be associated with the 404 status code returned by the server. The URL structure does not matter; only the status code and the redirect are important here.
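A minimal sketch of building that redirect target, assuming the server is configured to answer `/404-error` with a real 404 status. `notFoundRedirectUrl` is a hypothetical helper, and the `from` parameter simply mirrors the example URL above.

```typescript
// Hypothetical helper: build the URL of the server-side 404 page for a path
// that the client-side router could not match.
function notFoundRedirectUrl(badPath: string, errorPage: string = "/404-error"): string {
  // Drop leading slashes so "/does-not-exist" becomes "does-not-exist".
  const from = badPath.replace(/^\/+/, "");
  return errorPage + "?from=" + encodeURIComponent(from);
}

// In a browser, the router's "not found" handler could then do something like:
//   location.replace(notFoundRedirectUrl(location.pathname));
// location.replace() keeps the bad URL out of the back-button history.
```

Keeping the original path in the query string also lets the error page show the user which URL failed.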

Your other options are SSR (Nuxt.js, Next.js, Angular Universal, etc.) or pre-rendering (prerender.io, Puppeteer, etc.), which Google calls dynamic rendering: you respond to search bot requests with a pre-rendered version, while human users get your normal client-side rendered app.

tl;dr: Drop hashbang support and opt for PJAX-like behavior if you care about SEO.

Are you making an app or a website? If it is a website, you need to return a 404 so that you don't confuse Google. It needs to be a real 404, not just a "page not found" message (i.e. a 200 with the message "page not found" is very bad). Also, which browsers do you care to support?

My opinion is that the whole hashbang server-side rendering approach should be avoided (i.e. the nasty Google SEO #! hack). Either use real pushState or, for browsers that don't support pushState, reload the whole page when the URL changes (a real URL change, not a hash change).
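Roughly, that choice between pushState and a full page load could look like the sketch below. `navigate` and its `render` callback are hypothetical names, and the `history` and `location` objects are passed in as parameters (typed structurally) so the function can be read and tested outside a browser.

```typescript
// Structural stand-ins for the browser's window.history and window.location.
interface HistoryLike {
  pushState?: (data: unknown, unused: string, url?: string) => void;
}
interface LocationLike {
  assign(url: string): void;
}

// Hypothetical navigation helper: use pushState when it is available,
// otherwise fall back to an ordinary full page load of the real URL.
function navigate(
  url: string,
  history: HistoryLike,
  location: LocationLike,
  render: (url: string) => void
): "pushstate" | "reload" {
  if (typeof history.pushState === "function") {
    history.pushState(null, "", url); // update the address bar without reloading
    render(url);                      // let the client-side router draw the view
    return "pushstate";
  }
  location.assign(url); // old browser: the server renders the page (or a 404)
  return "reload";
}
```

Either way the address bar ends up holding a real URL, so the server can always answer a bad one with a genuine 404.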

Now the reason this matters is that a #! URL should never return a 404: it doesn't make sense, and it's impossible to mimic on the server side, because the server never receives what comes after the #! without running JavaScript.

Thus, if you really care about SEO, I would do something like PJAX: only use true pushState for routing, and otherwise just fall back to old Web 1.0 behavior (full page loads). Consequently, the links I recommend you share, the ones that can truly return a 404, should not contain #! (a traditional # is fine so long as the contents of the page don't change drastically).

Finally, the 404 is mostly not the problem; 30X (i.e. redirect) responses are. That's because the browser handles redirects automatically, so your JavaScript AJAX calls will never see a 30X (they will get the response of the redirect target instead, i.e. a 200). To handle 30X responses, you will have to send a header back with every response to indicate what the final URL is (i.e. what you were redirected to), so that you don't mess up the pushState history.
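A small sketch of picking the URL to record in the pushState history after an AJAX call. The header name `X-Final-URL` is an assumption made up for this example (the server would have to be set up to send it), and `urlForHistory` is a hypothetical helper. With `fetch`, `Response.url` already holds the final URL after redirects, so it is used as a second choice.

```typescript
// Structural stand-in for the parts of a fetch Response used here.
interface ResponseLike {
  url: string;
  headers: { get(name: string): string | null };
}

// Hypothetical helper: decide which URL to record in the pushState history.
// Preference order: the (assumed) server header, then the final URL reported
// by the response, then the URL that was originally requested.
function urlForHistory(
  requestedUrl: string,
  res: ResponseLike,
  headerName: string = "X-Final-URL"
): string {
  const fromHeader = res.headers.get(headerName);
  if (fromHeader) return fromHeader;
  return res.url || requestedUrl;
}
```

The result would then be passed to `history.pushState` instead of the requested URL, so the address bar matches the page that was actually loaded.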

Of course, if you need to support hashbang like Twitter used to (and they are the ones that even killed hashbang), you can leverage Google Sitemaps and rel=nofollow to try to mitigate bad SEO.


 