How do I use a regex in JavaScript to extract specific parts of a URL path?

Right now, I'm trying to take a URL in this format:

https://www.example.com/{section}/posts/{number}

and get section and number. I need to do it with a regex; I cannot just break it into a parts array. I have tried:

var sect = myURL.match('https://www.example.com/[^/]+');

but the output I get is "https://www.example.com/{section}". I want to get both the section and the number. How do I do this in JavaScript?

You can destructure the result of match into multiple variables like this:

 var myURL = 'https://www.example.com/mysection/posts/1234';
 // Skip the full match at index 0 and keep the two capture groups.
 const [, sec, num] = myURL.match(/^https?:\/\/www\.example\.com\/([^\/]+)\/posts\/(\d+)\/?$/);
 console.log(sec); //=> mysection
 console.log(num); //=> 1234

RegEx Details:

  • ^ : Match the start of the string
  • https?:\/\/www\.example\.com\/ : Match http:// or https:// followed by www.example.com/
  • ([^\/]+) : Match 1+ of any character that is not / and capture as group #1
  • \/posts\/ : Match /posts/
  • (\d+) : Match 1+ digits and capture as group #2
  • \/?$ : Match an optional trailing / before end
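One thing worth keeping in mind with the pattern above: match returns null when the string does not fit, so destructuring the result directly throws a TypeError. A minimal sketch of a guarded version follows; the helper name parsePostUrl is hypothetical and not part of the original answer.

```javascript
// Hypothetical helper (illustrative name, not from the original answer).
function parsePostUrl(url) {
  const match = url.match(/^https?:\/\/www\.example\.com\/([^\/]+)\/posts\/(\d+)\/?$/);
  // String.prototype.match returns null when nothing matches,
  // so check before destructuring.
  if (match === null) return null;
  // Index 0 is the full match; indexes 1 and 2 are the capture groups.
  const [, section, number] = match;
  return { section, number };
}

console.log(parsePostUrl('https://www.example.com/mysection/posts/1234'));
console.log(parsePostUrl('https://www.example.com/mysection/comments/1234'));
```

The first call logs an object with section 'mysection' and number '1234' (both strings); the second logs null because the path does not contain /posts/.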

If you don't have to validate that the string is in fact a URL then just split it on forward slashes.

 var parts = `https://www.example.com/{section}/posts/{number}`.split(/\//);
 console.log(parts[3]);
 console.log(parts[5]);
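If the real URLs may carry a query string or fragment, splitting the whole string would leave that suffix attached to the last part. One possible variation, sketched here with the hypothetical helper name splitPostPath, is to split only the pathname obtained from the built-in URL class:

```javascript
// Hypothetical helper (illustrative name). Splits only the pathname, so a
// query string or fragment cannot end up inside the last segment.
function splitPostPath(url) {
  // '/mysection/posts/1234' -> ['', 'mysection', 'posts', '1234'];
  // filter(Boolean) drops the empty strings produced by leading/trailing slashes.
  const segments = new URL(url).pathname.split('/').filter(Boolean);
  // Assumes the /{section}/posts/{number} layout from the question.
  return { section: segments[0], number: segments[2] };
}

console.log(splitPostPath('https://www.example.com/mysection/posts/1234?page=2#top'));
```

Unlike the regex versions, this sketch does not check that the middle segment is actually posts.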

If you "must" use regex match then:

 var matches = `https://www.example.com/{section}/posts/{number}`.match(/.*\/(?<section>[^\/]+)\/posts\/(?<number>.+)/);
 console.log(matches.groups['section']);
 console.log(matches.groups['number']);

Of course, one needs to retrieve this kind of path information only from a URL's pathname, e.g. via the named capturing groups of a suitably written RegExp.

For the provided example, the URL's pathname will be...

/FOOBARBAZ/posts/987

... thus a regex which uses named capture groups looks like...

/\/(?<section>[^\/]+)\/posts\/(?<number>[^\/?#]+)/

... which reads like...

  • \/(?<section>[^\/]+) ... match a single slash, then capture any sequence of characters that are not a slash, and name this capture group section ... then...
  • \/posts ... match a single slash and the sequence posts ... then...
  • \/(?<number>[^\/?#]+) ... match a single slash, then capture any sequence of characters that are not a slash, question mark, or hash, and name this capture group number.

 const { section, number } = new URL('https://www.example.com/FOOBARBAZ/posts/987')
   .pathname
   .match(/\/(?<section>[^\/]+)\/posts\/(?<number>[^\/?#]+)/)
   .groups;
 console.log({ section, number });

The same capturing approach without named groups looks like this...

 const [ section, number ] = new URL('https://www.example.com/FOOBARBAZ/posts/987')
   .pathname
   .match(/\/([^\/]+)\/posts\/([^\/?#]+)/)
   .slice(1);
 console.log({ section, number });
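Both pathname-based variants throw if the input is not a valid URL (the URL constructor raises a TypeError) or if the path does not match (match returns null). A minimal sketch of a reusable, guarded version might look like the following. The names POST_PATH and extractSectionAndNumber are hypothetical, and anchoring the pattern with ^ is an added assumption that the section is the first path segment.

```javascript
// Hypothetical names; the ^ anchor is an assumption not present in the
// original pattern (it requires the section to be the first path segment).
const POST_PATH = /^\/(?<section>[^\/]+)\/posts\/(?<number>[^\/?#]+)/;

function extractSectionAndNumber(input) {
  let pathname;
  try {
    // The URL constructor throws a TypeError for an invalid absolute URL.
    pathname = new URL(input).pathname;
  } catch {
    return null;
  }
  // match() returns null on failure; ?. avoids reading .groups of null.
  const groups = pathname.match(POST_PATH)?.groups;
  return groups ? { section: groups.section, number: groups.number } : null;
}

console.log(extractSectionAndNumber('https://www.example.com/FOOBARBAZ/posts/987?x=1#y'));
console.log(extractSectionAndNumber('not a url'));
```

Returning null for both failure modes keeps the caller's check to a single comparison; whether an exception would be more appropriate depends on how the surrounding code treats bad input.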


 