
Regex: get everything after the hostname in a URL

How would I go about getting everything after the hostname in JavaScript?

This is the regex I have so far, but I now need to capture everything from the first / after the hostname to the end of the string.

https?\:\/\/(.*)

String

http://www.myurl.com/en/country/belgium/

So for the string I need to capture:

/en/country/belgium/

I have been toying with this example even after reading up on regex. If anybody could take a couple of minutes to provide an example, that would be really nice.

Edit

To be clear, I am using document.referrer here, and to my knowledge it does not come with helpers like document.location does.

You should use the URL class instead:

var url = new URL('http://www.myurl.com/en/country/belgium/');
console.log(url.pathname); // /en/country/belgium/

url;
/*
URL {
    hash: "",
    host: "www.myurl.com",
    hostname: "www.myurl.com",
    href: "http://www.myurl.com/en/country/belgium/",
    origin: "http://www.myurl.com",
    password: "",
    pathname: "/en/country/belgium/",
    port: "",
    protocol: "http:",
    search: "",
    searchParams: URLSearchParams {},
    username: ""
}
*/

More info: https://developer.mozilla.org/en-US/docs/Web/API/URL
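Since document.referrer is a plain string and can be empty, a guarded wrapper around the URL class may be useful. The following is a minimal sketch; the helper name pathAfterHost is hypothetical, and it assumes that "everything after the hostname" should include the query string and fragment as well as the pathname.

```javascript
// Hypothetical helper: returns path + query + fragment, or null if the
// input cannot be parsed as an absolute URL.
function pathAfterHost(urlString) {
  try {
    var url = new URL(urlString);
    return url.pathname + url.search + url.hash;
  } catch (e) {
    // new URL() throws a TypeError for strings that are not absolute URLs,
    // including the empty string.
    return null;
  }
}

console.log(pathAfterHost('http://www.myurl.com/en/country/belgium/?page=2#top'));
// /en/country/belgium/?page=2#top
console.log(pathAfterHost('')); // null
```

In a browser, the call would be pathAfterHost(document.referrer).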

Since you need to parse a URL held in a string, you can use a regex.

Logic:

  • Start matching with https? . This matches both http and https (http[s]* would also accept httpss and the like).
  • Then match the literal ://
  • Now match the hostname, which is everything up to the next / . Capture that / and everything after it.

var str = 'http://www.myurl.com/en/country/belgium/';
var pathNameRegex = /https?:\/\/[^\/]+(\/.+)/;
var matches = str.match(pathNameRegex);
console.log(matches[1]); // /en/country/belgium/
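The steps above can be sketched as a small function that also copes with input that does not match, since match returns null when the URL has no path at all. The name extractPath is hypothetical, and the ^ anchor and the (\/.*) group (which also accepts a bare trailing slash) are assumptions added for this sketch rather than part of the answer's regex.

```javascript
// Hypothetical helper wrapping the regex approach.
function extractPath(str) {
  // ^https?   -> "http" or "https" at the start of the string
  // :\/\/     -> the literal "://"
  // [^\/]+    -> the hostname (everything up to the next slash)
  // (\/.*)    -> group 1: the first slash and the rest of the string
  var match = /^https?:\/\/[^\/]+(\/.*)/.exec(str);
  return match ? match[1] : null;
}

console.log(extractPath('http://www.myurl.com/en/country/belgium/'));
// /en/country/belgium/
console.log(extractPath('http://www.myurl.com')); // null (no path present)
```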

Use the URL object.

var url = new URL("http://www.myurl.com/en/country/belgium/");
console.log(url.pathname);

UPDATE: Using an anchor tag to polyfill URL (I'm not sure whether this is a complete polyfill for everything that URL does, but it should be enough for your task):

if (typeof URL === 'undefined') {
    var URL = function(url) {
        var a = document.createElement('a');
        a.href = url;
        return a;
    }
}

var url = new URL('https://www.example.com/pathname/');
var path = url.pathname;

Just create an anchor and let the browser parse it. Works everywhere.

var a = document.createElement('a');
a.href = 'http://www.myurl.com/en/country/belgium/'; // or document.referrer
var path = a.pathname;
console.log(path);

Without regex, you can use the following:

var pathArray = location.href.split( '/' );
var protocol = pathArray[0];
var host = pathArray[2];
var baseUrl = protocol + '//' + host;
var nonBaseUrl = window.location.href.replace(baseUrl, '');
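The snippet above reads location.href, whereas the question is about a referrer string. A sketch of the same split idea applied to an arbitrary string could look like this; the function name pathFromSplit is hypothetical, and it assumes the input has the form protocol://host/path.

```javascript
// Hypothetical helper: same split('/') idea, but for any URL string.
function pathFromSplit(urlString) {
  // For 'http://host/a/b/' the parts are: 'http:', '', 'host', 'a', 'b', ''
  var parts = urlString.split('/');
  if (parts.length < 4) {
    return ''; // nothing after the host
  }
  // Drop protocol, the empty element, and the host, then rebuild the path.
  return '/' + parts.slice(3).join('/');
}

console.log(pathFromSplit('http://www.myurl.com/en/country/belgium/'));
// /en/country/belgium/
```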

You can achieve that with a simple replace.

var url = 'http://www.myurl.com/en/country/belgium/';
var path = url.replace(/https?:\/\/[^\/]+/g, '');
console.log(path); // prints /en/country/belgium/

But if you want to capture the path you can use the same regex with a capture group

var url = 'http://www.myurl.com/en/country/belgium/';
var regex = /https?:\/\/[^\/]+(.*)/g;
var match = regex.exec(url);
console.log(match[1]); // prints /en/country/belgium/

I suggest:

/https?:\/\/[^\s\/]*(\/\S*)/

[^\s\/] is a character class that excludes whitespace and slashes.

\S is a shorthand character class that matches any character except whitespace.

Note that : isn't a special character and doesn't need to be escaped.
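Because this pattern stops at whitespace, it can also pull the path out of a URL embedded in longer text. A minimal sketch, where the function name findPath and the sample sentence are made up for illustration:

```javascript
// Hypothetical helper using the suggested pattern.
function findPath(text) {
  // [^\s\/]*  -> the hostname: no whitespace, no slashes
  // (\/\S*)   -> group 1: the first slash and every following non-space character
  var match = /https?:\/\/[^\s\/]*(\/\S*)/.exec(text);
  return match ? match[1] : null;
}

console.log(findPath('Referrer was http://www.myurl.com/en/country/belgium/ at login'));
// /en/country/belgium/
```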
