
How do I use a regex in Javascript to extract specific parts of a URL path

Right now, I'm trying to take a URL in this format:

https://www.example.com/{section}/posts/{number}

and get section and number. I need to do it with a regex; I cannot just break it into a parts array. I have tried:

var sect = myURL.match('https://www.example.com/[^/]+');

but the output I get is "https://www.example.com/{section}". I want to be able to get the section and the number. How do I do this in Javascript?

You can assign the output of match to multiple variables like this:

    var myURL = 'https://www.example.com/mysection/posts/1234';
    // Skip the full match (index 0) and keep the two capture groups.
    const [, sec, num] = myURL.match(/^https?:\/\/www\.example\.com\/([^\/]+)\/posts\/(\d+)\/?$/);
    console.log(sec) //=> mysection
    console.log(num) //=> 1234

RegEx Details:

  • ^ : Start
  • https?:\/\/www\.example\.com\/ : Match http:// or https://, followed by www.example.com/
  • ([^\/]+) : Match 1+ of any character that is not / and capture as group #1
  • \/posts\/ : Match /posts/
  • (\d+) : Match 1+ digits and capture as group #2
  • \/?$ : Match an optional trailing / before end
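Note that `String.prototype.match` returns `null` when nothing matches, so destructuring its result directly throws for a URL that does not fit the format. A minimal sketch of one way to guard against that, using the same pattern; the helper name `extractSectionAndNumber` is hypothetical and not part of the original answer:

```javascript
// Hypothetical helper: returns { sec, num } or null if the URL does not fit the format.
function extractSectionAndNumber(url) {
  const m = url.match(/^https?:\/\/www\.example\.com\/([^\/]+)\/posts\/(\d+)\/?$/);
  if (m === null) {
    return null; // no match: avoid destructuring null
  }
  const [, sec, num] = m; // skip m[0], the full match
  return { sec, num };    // both values are strings
}

console.log(extractSectionAndNumber('https://www.example.com/mysection/posts/1234'));
//=> { sec: 'mysection', num: '1234' }
console.log(extractSectionAndNumber('https://www.example.com/mysection/pages/1234'));
//=> null
```

The captured number is a string; convert it with `Number(num)` if a numeric value is needed.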

If you don't have to validate that the string is in fact a URL, then just split it on forward slashes.

    var parts = `https://www.example.com/{section}/posts/{number}`.split(/\//);
    console.log(parts[3]); // the section segment
    console.log(parts[5]); // the number segment

If you "must" use regex match then:

    var matches = `https://www.example.com/{section}/posts/{number}`.match(/.*\/(?<section>[^\/]+)\/posts\/(?<number>.+)/);
    console.log(matches.groups['section']);
    console.log(matches.groups['number']);
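One caveat may be worth illustrating: because the `number` group uses `.+`, it captures everything after `/posts/`, including further path segments or a query string. A minimal sketch of the difference, using a sample URL invented for illustration and a narrower character class (`[^\/?#]+`) as an assumed alternative:

```javascript
// Sample URL invented for illustration; not from the original question.
const sampleUrl = 'https://www.example.com/news/posts/42/comments?page=2';

// Pattern from the answer above: `.+` runs to the end of the string.
const loose = sampleUrl.match(/.*\/(?<section>[^\/]+)\/posts\/(?<number>.+)/);
console.log(loose.groups.number); //=> 42/comments?page=2

// Assumed alternative: stop at the next slash, question mark, or hash.
const strict = sampleUrl.match(/.*\/(?<section>[^\/]+)\/posts\/(?<number>[^\/?#]+)/);
console.log(strict.groups.section); //=> news
console.log(strict.groups.number);  //=> 42
```

Which behavior is wanted depends on whether URLs with extra segments should be accepted at all.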

Of course, one needs to retrieve this kind of path information only from a URL's pathname, e.g. via the named capturing groups of an accordingly written RegExp.

For the provided example, the URL's pathname will be...

/FOOBARBAZ/posts/987

... thus, a regex which uses named capture groups looks like...

/\/(?<section>[^\/]+)\/posts\/(?<number>[^\/?#]+)/

... which reads like...

  • \/(?<section>[^\/]+) ... match a single slash, then capture any sequence of characters that are not a slash, and name this capture group section ... then...
  • \/posts ... match a single slash and the sequence posts ... then...
  • \/(?<number>[^\/?#]+) ... match a single slash, then capture any sequence of characters that are neither a slash, a question mark, nor a hash, and name this capture group number.

    const { section, number } = new URL('https://www.example.com/FOOBARBAZ/posts/987')
      .pathname
      .match(/\/(?<section>[^\/]+)\/posts\/(?<number>[^\/?#]+)/)
      .groups;

    console.log({ section, number });

The same capturing approach without named groups looks like this...

    const [ section, number ] = new URL('https://www.example.com/FOOBARBAZ/posts/987')
      .pathname
      .match(/\/([^\/]+)\/posts\/([^\/?#]+)/)
      .slice(1); // drop the full match, keep the two capture groups

    console.log({ section, number });
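Both variants assume the input is a valid URL whose path has the expected shape: `new URL(...)` throws a `TypeError` for an invalid URL, and `match` returns `null` when the pathname does not fit, so reading `.groups` or calling `.slice` would then throw. A minimal sketch of a guarded version of the named-group approach; the helper name `parsePostPath` and the test inputs are hypothetical:

```javascript
// Hypothetical helper: returns { section, number } or null.
function parsePostPath(href) {
  let pathname;
  try {
    pathname = new URL(href).pathname; // throws TypeError for an invalid URL
  } catch {
    return null;
  }
  const match = pathname.match(/\/(?<section>[^\/]+)\/posts\/(?<number>[^\/?#]+)/);
  // match is null when the path does not have the expected shape
  return match ? { ...match.groups } : null;
}

console.log(parsePostPath('https://www.example.com/FOOBARBAZ/posts/987?ref=home#top'));
//=> { section: 'FOOBARBAZ', number: '987' }
console.log(parsePostPath('https://www.example.com/about')); //=> null
console.log(parsePostPath('not a url'));                     //=> null
```

Since `pathname` already excludes the query string and fragment, the `?` and `#` exclusions in the character class act only as an extra safeguard here.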
