
Removing hash from URL on HTTP redirect

I just added a feature on a website to allow users to log in with Facebook. As part of the authentication workflow Facebook forwards the user to a callback URL on my site such as below.

https://127.0.0.1/facebook-login-callback?code.....#_=_

Note the trailing #_=_, which is not part of the authentication data. Facebook appears to add it for no clear reason.

Upon receiving the request in the backend I validate the credentials, create a session for the user, then forward them to the main site using a Location header in the HTTP response.

I've inspected the HTTP response in my browser's developer tools and confirmed that I set the Location header as follows:

Location: https://127.0.0.1/

The issue is that the URL that appears in the browser address bar after forwarding is https://127.0.0.1/#_=_

I don't want the user to see this trailing string. How can I ensure it is removed when redirecting the user to a new URL?

The issue happens in all browsers I have tested: Chrome, Firefox, Safari and a few others.

I know a similar question has been answered in other threads. However, unlike in those threads, there is no jQuery or JavaScript in this workflow. All the processing of the login callback happens exclusively in backend code.

EDIT

Adding a bounty. This has been driving me up the wall for some time. I have no explanation and don't even have a guess as to what's going on. So I'm going to share some of my hard-earned Stackbux with whoever can help me.

Just to be clear on a few points:

  • There is no Javascript in this authentication workflow whatsoever.
  • I have implemented my own Facebook login workflow without using their Javascript libraries or other third-party tools. It interacts directly with the Facebook REST API using my own Python code, exclusively in the backend.

Below are excerpts from the raw HTTP requests as obtained from Firefox inspect console.

1 User connects to mike.local/facebook-login and is forwarded to Facebook's authentication page

HTTP/1.1 302 Found
Server: nginx/1.19.0
Date: Sun, 28 Nov 2021 10:44:30 GMT
Content-Type: text/plain; charset="UTF-8"
Content-Length: 0
Connection: keep-alive
Location: https://www.facebook.com/v12.0/dialog/oauth?client_id=XXX&redirect_uri=https%3A%2F%2Fmike.local%2Ffacebook-login-callback&state=XXXX

2 User accepts and Facebook redirects them to mike.local/facebook-login-callback

HTTP/3 302 Found
location: https://mike.local/facebook-login-callback?code=XXX&state=XXX#_=_

... Request truncated here. Note the offending #_=_ at the tail of the Location.

3 The backend processes the tokens Facebook provides via the user forwarding, creates a session for the user, then forwards them to mike.local. I do not add #_=_ to the Location HTTP header, as seen below.

HTTP/1.1 302 Found
Server: nginx/1.19.0
Date: Sun, 28 Nov 2021 10:44:31 GMT
Content-Type: text/plain; charset="UTF-8"
Content-Length: 0
Connection: keep-alive
Location: https://mike.local/
Set-Cookie: s=XXX; Path=/; Expires=Fri, 01 Jan 2038 00:00:00 GMT; SameSite=None; Secure;
Set-Cookie: p=XXX; Path=/; Expires=Fri, 01 Jan 2038 00:00:00 GMT; SameSite=Strict; Secure;

4 User arrives at mike.local and sees a trailing #_=_ in the URL. I have observed this in Firefox, Chrome, Safari and Edge.

[Image: mysterious characters at the end of the URL]

I have confirmed via the Firefox inspect console there are no other HTTP requests being sent. In other words I can confirm 3 is the final HTTP response sent to the user from my site.

According to RFC 7231 §7.1.2:

If the Location value provided in a 3xx (Redirection) response does not have a fragment component, a user agent MUST process the redirection as if the value inherits the fragment component of the URI reference used to generate the request target (i.e., the redirection inherits the original reference's fragment, if any).

If Facebook redirects you to a URI with a fragment identifier, that fragment identifier will be attached to the target of your own redirect. (Not a design I agree with much; it would make sense semantically for HTTP 303 redirects, which is what would logically fit in this workflow better, to ignore the fragment identifier of the originator. It is what it is, though.)
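To make the rule concrete, here is a minimal Python sketch that models how a user agent picks the final URL. It is an illustration only, not any browser's actual code; the function name resolve_redirect is made up for this example.

```python
from urllib.parse import urldefrag, urljoin


def resolve_redirect(request_url: str, location: str) -> str:
    """Model the RFC 7231 section 7.1.2 fragment rule (hypothetical helper)."""
    # Resolve a possibly relative Location against the URL that was requested.
    target = urljoin(request_url, location)
    # A Location that carries its own fragment is used as given.
    if "#" in location:
        return target
    # Otherwise the redirect inherits the fragment of the original request URL.
    fragment = urldefrag(request_url).fragment
    return f"{target}#{fragment}" if fragment else target


# Step 2 URL from the question, followed by the step 3 Location header.
final_url = resolve_redirect(
    "https://mike.local/facebook-login-callback?code=XXX&state=XXX#_=_",
    "https://mike.local/",
)
print(final_url)
```

With the URLs from the question, the fragment-free Location in step 3 inherits #_=_ from the callback URL, which matches what the address bar shows in step 4.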

The best you can do is clean up the fragment identifier with a JavaScript snippet on the target page:

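<script type="text/javascript">
    // Only act on the exact fragment Facebook appends.
    if (location.hash === '#_=_') {
        if (typeof history !== 'undefined' &&
            history.replaceState &&
            typeof URL !== 'undefined')
        {
            // Rewrite the address bar in place, without a reload or a new history entry.
            var u = new URL(location.href);
            u.hash = '';
            history.replaceState(null, '', u);
        } else {
            // Fallback for older browsers; this may leave a bare '#' in the URL.
            location.hash = '';
        }
    }
</script>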

Alternatively, you can use meta refresh/the Refresh HTTP header, as that method of redirecting does not preserve the fragment identifier:

<meta http-equiv="Refresh" content="0; url=/">

Presumably you should also include a manual link to the target page, for the sake of clients that do not implement Refresh.

But if you ask me what I'd personally do: leave the poor thing alone. A useless fragment identifier is pretty harmless anyway, and this kind of silly micromanagement is not worth turning the whole slate of Internet standards upside-down (using a more fragile, non-standard method of redirection; shoving yet another piece of superfluous JavaScript the user's way) just for the sake of pretty minor address bar aesthetics. Like The House of God says: 'The delivery of good medical care is to do as much nothing as possible'.
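Since the backend in the question is Python, the Refresh variant might look roughly like the sketch below. The question does not say which framework is used, so this assumes a bare WSGI callable; the name refresh_redirect_app and the target parameter are hypothetical, and the session cookies from step 3 would be added to the same header list.

```python
from html import escape


def refresh_redirect_app(environ, start_response, target="/"):
    """WSGI sketch: send the user to `target` via Refresh instead of Location."""
    safe_target = escape(target, quote=True)
    # The body repeats the redirect as a meta tag and offers a manual link
    # for clients that do not implement Refresh.
    body = (
        "<!DOCTYPE html>\n"
        f'<meta http-equiv="Refresh" content="0; url={safe_target}">\n'
        f'<p><a href="{safe_target}">Continue</a></p>\n'
    ).encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=UTF-8"),
        ("Content-Length", str(len(body))),
        # 200 plus Refresh replaces the 302 plus Location pair from step 3.
        ("Refresh", f"0; url={target}"),
    ]
    start_response("200 OK", headers)
    return [body]
```

The response is a 200 with a small HTML body rather than an empty 302, which is part of why this approach is more fragile than a standard redirect.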

Not a complete answer but a couple of wider architectural points for future reference, to add to the above answer which I upvoted.

AUTHORIZATION SERVER

If you enabled an AS to manage the connection to Facebook for you, your apps would not need to deal with this problem.

An AS can deal with many deep authentication concerns to externalize complexity from apps.

SPAs

An SPA would have better control over processing login responses, as in this code of mine which uses history.replaceState.

SECURITY

An SPA can be just as secure as a website with the correct architecture design - see this article.


 