Regex to get first word after third slash but before minus symbol

Here is one of the URLs I have to deal with:

https://www.some-domain.de/city/123/street-firstname-lastname

I need to get the street using JS. The domain stays the same and the city stays the same, but the number 123 can vary.

I have managed to get some other values so far, but I am completely lost on how to get the street. Any help is appreciated.

I'd suggest using an existing library to parse the URL and isolate the path. Then split on '/' to get the path segments (or the URL parser you use may have an option for this). Once you have the single path segment you want, you can split on '-' or use a regex, whichever you prefer.

The advantage is not just that you won't have to write a large, complex regex yourself; it also frees you from many assumptions about the input data and gives a more reliable solution.
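As a minimal sketch of that approach, the built-in URL class (available in browsers and Node.js) can stand in for a third-party parser. The helper name streetFromUrl is hypothetical, and the sketch assumes the street is always the first word of the last non-empty path segment.

```javascript
// Hypothetical helper: parse the URL, isolate the path, then split twice.
function streetFromUrl(href) {
  // For the sample URL, pathname is "/city/123/street-firstname-lastname"
  const { pathname } = new URL(href);
  // Splitting on "/" leaves empty strings for leading/trailing slashes; drop them
  const segments = pathname.split('/').filter(Boolean);
  const last = segments.length > 0 ? segments[segments.length - 1] : '';
  // The first word before the first "-" is the street
  return last.split('-')[0];
}

console.log(streetFromUrl('https://www.some-domain.de/city/123/street-firstname-lastname'));
// prints "street"
```

Because the query string and fragment are not part of pathname, a URL such as .../street-firstname-lastname?ref=a-b still yields "street".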

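// Splitting the sample URL on '/' gives:
// ["https:", "", "www.some-domain.de", "city", "123", "street-firstname-lastname"]
var citynumber = document.location.href.split('/')[4]; // "123"

var address = document.location.href.split('/')[5]; // "street-firstname-lastname"
var street = address.split('-')[0]; // "street"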

Found it with the help of @ShapeOfMatter. I just picked everything after the last slash, then split it up using "-" this time instead of "/", and picked the first word. Thank you very much.
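The "everything after the last slash" idea can also be written without relying on a fixed segment index. This is only a sketch: the function name firstWordAfterLastSlash is hypothetical, and it takes the URL as a string argument so it can be tried outside the browser.

```javascript
// Hypothetical helper: take the text after the last "/", then the first word before "-".
function firstWordAfterLastSlash(href) {
  const lastSegment = href.slice(href.lastIndexOf('/') + 1);
  return lastSegment.split('-')[0];
}

console.log(firstWordAfterLastSlash('https://www.some-domain.de/city/123/street-firstname-lastname'));
// prints "street"
```

Note that a URL ending in a trailing slash would give an empty string here, since nothing follows the last slash.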

The following regex validates the URL pattern and captures the street name:

/^https*:\/\/(?:[^\/]+\/){3}([^-]+)/

It reads as follows:

  • ^https*:\/\/ ... match, from the beginning, either of the protocols http:// or https:// (strictly, s* accepts any number of "s" characters; https? would be tighter).
  • (?:[^\/]+\/){3} ... match the following pattern:
    • [^\/]+\/ ... a sequence of one or more characters that are not a slash, followed by a slash.
    • (?: ... ) ... do not capture this group.
    • {3} ... the pattern has to be repeated exactly 3 times.
  • ([^-]+) ... capture a sequence of characters that are not a minus/dash, which is the street name.

const regXCaptureStreet = (/^https*:\/\/(?:[^\/]+\/){3}([^-]+)/gm);

const sampleText = `https://some-domain.de/city/123/street-firstname-lastname
//foo/bar/baz/biz/
http://example.org/the-city/987654/mystreetname-myfirstname-mylastname
//foo/bar/baz/biz/
https://some-domain.de/city/123/streetname-firstname-lastname
//foo/bar/baz/biz/
http://example.org/the-city/987654/nameofstreet-myfirstname-mylastname`;

console.log(
  [...sampleText.matchAll(regXCaptureStreet)].map(result => result[1])
);
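For a single URL, as in the question, the same pattern can be wrapped in a small function that returns null when the input does not match. This is a sketch rather than part of the original answer: the names streetPattern and extractStreet are hypothetical, and the pattern uses https? instead of https* so that only one optional "s" is accepted.

```javascript
// Variation on the regex above: no g/m flags, since only one URL is tested at a time.
const streetPattern = /^https?:\/\/(?:[^\/]+\/){3}([^-]+)/;

// Hypothetical helper: returns the captured street name, or null if the URL does not fit.
function extractStreet(url) {
  const match = url.match(streetPattern);
  return match ? match[1] : null;
}

console.log(extractStreet('https://some-domain.de/city/123/street-firstname-lastname'));
// prints "street"
console.log(extractStreet('//foo/bar/baz/biz/'));
// prints null
```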
