
Javascript match for string that doesn't contain a character

I'm trying to use JavaScript to match a URL pattern containing one directory, with an optional trailing slash.

For example:

This should match:

text and http://twitter.com/path and more text

This should not match:

text and http://twitter.com/path/other/directories and more text

Even though the shorter string exists in the longer one, I don't want the longer one to return anything.

Is this possible?


Here's what I've tried so far:

The approach was to match the URL, and then use either a negated character class or a negative lookahead.

I've tried the following:

/(https?:)?(\/\/)?(www\.)?twitter\.com\/[a-z0-9_+-]+\/?(?![a-z0-9_+-])/ig

This was meant to look for a twitter URL, with a \w+ path, with an optional trailing slash, not followed by any other \w+.

While this doesn't include the second directory in its match, I wanted it to not match the string at all.

/(https?:)?(\/\/)?(www\.)?twitter\.com\/\w+[^\/\w]*/ig

This was meant to find the URL, but exclude slashes and \w following. Similar to the previous try, it still matches the long links.

I've tried variations like this, but can't get it to work:
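To make the problem concrete, here is a small sketch that runs both attempts against the long link. The variable names and the sample string are only illustrative. In both cases the engine backtracks to the first directory and reports it as a match, instead of rejecting the string.

```javascript
// The two patterns quoted above, unchanged.
const attempt1 = /(https?:)?(\/\/)?(www\.)?twitter\.com\/[a-z0-9_+-]+\/?(?![a-z0-9_+-])/ig;
const attempt2 = /(https?:)?(\/\/)?(www\.)?twitter\.com\/\w+[^\/\w]*/ig;

// Illustrative input: a URL with more than one directory.
const longLink = 'text and http://twitter.com/path/other/directories and more text';

// attempt1: the optional slash is given up, and the lookahead then sees "/",
// which is not in the character class, so the shorter match is accepted.
const result1 = longLink.match(attempt1);

// attempt2: [^\/\w]* is allowed to match zero characters, so nothing forces
// the match to fail when a slash follows the first directory.
const result2 = longLink.match(attempt2);

console.log(result1); // [ 'http://twitter.com/path' ]
console.log(result2); // [ 'http://twitter.com/path' ]
```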

    var regex1 = /(https?:)?(\/\/)?(www\.)?twitter\.com\/\w+(?!\/\w+)/ig;
    var regex2 = /(https?:)?(\/\/)?(www\.)?twitter\.com\/\w+[^\/\w]*/ig;
    var shouldMatch = 'text https://twitter.com/page text';
    var shouldNotMatch = 'text https://twitter.com/page/status/123 text';

    // Note: match() with the g flag returns null, not [], when nothing matches.
    console.log('regex1 should match', shouldMatch.match(regex1));
    console.log('regex1 should return null', shouldNotMatch.match(regex1));
    console.log('regex2 should match', shouldMatch.match(regex2));
    console.log('regex2 should return null', shouldNotMatch.match(regex2));

You could use a negative lookahead at the end of the valid part, asserting that the page name is followed neither by a / and a word character, nor by another word character. The second alternative (another word character) is needed because, without it, the regex would backtrack and match a shortened page name such as http://twitter.com/pag.

    var regex1 = /(https?:\/\/)?(www\.)?twitter\.com\/\w+(?!\/\w|\w)/ig;
    var shouldMatch = 'https://twitter.com/page is a valid url';
    var shouldNotMatch = 'https://twitter.com/page/status/123 is not valid';

    // Note: match() with the g flag returns null, not [], when nothing matches.
    console.log('regex1 should match', shouldMatch.match(regex1));
    console.log('regex1 should return null', shouldNotMatch.match(regex1));
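The role of the extra \w alternative is easier to see side by side. The sketch below compares the lookahead with and without it; the names withoutWordCheck and withWordCheck are made up for this illustration.

```javascript
// Lookahead that only rejects "/" followed by a word character.
const withoutWordCheck = /(https?:\/\/)?(www\.)?twitter\.com\/\w+(?!\/\w)/ig;
// Lookahead from the answer: also rejects a following word character.
const withWordCheck = /(https?:\/\/)?(www\.)?twitter\.com\/\w+(?!\/\w|\w)/ig;

const longUrl = 'https://twitter.com/page/status/123 is not valid';
const slashUrl = 'https://twitter.com/page/ has a trailing slash';

// \w+ first takes "page", the lookahead fails on "/s", so \w+ backtracks
// to "pag". The next character is "e", which the lookahead does not reject.
const partial = longUrl.match(withoutWordCheck);

// With the \w alternative, every shortened name is followed by a word
// character, so every backtracking step is rejected as well.
const rejected = longUrl.match(withWordCheck);

// A trailing slash is tolerated, but it is not part of the matched text.
const trailing = slashUrl.match(withWordCheck);

console.log(partial);  // [ 'https://twitter.com/pag' ]
console.log(rejected); // null
console.log(trailing); // [ 'https://twitter.com/page' ]
```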

I would use a positive lookahead to assert that the last group of \w+ is followed by either a whitespace character or the end of the string.

(?:https?:\/\/)?(?:www\.)?twitter\.com\/\w+(?=\s|$)

Demo: https://regex101.com/r/EMHxq9/3

I also used non-capturing groups in place of capturing groups, because it doesn't look like you're back-referencing anything, and combined the optional "https:" and "//" parts, because it would be weird to have one without the other.
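A minimal sketch of this positive-lookahead approach in JavaScript follows. The gi flags are an assumption carried over from the question, since the pattern above is shown without flags. The second pattern, withTrailingSlash, is not part of the answer: it is a hypothetical variant that adds \/? to allow the optional trailing slash the question asks for.

```javascript
// Pattern from the answer; the gi flags are assumed.
const lookaheadRegex = /(?:https?:\/\/)?(?:www\.)?twitter\.com\/\w+(?=\s|$)/gi;
// Hypothetical variant: allow one optional slash before the whitespace or end.
const withTrailingSlash = /(?:https?:\/\/)?(?:www\.)?twitter\.com\/\w+\/?(?=\s|$)/gi;

const shortUrl = 'text https://twitter.com/page text';
const longUrl = 'text https://twitter.com/page/status/123 text';
const slashUrl = 'text https://twitter.com/page/ text';

const shortResult = shortUrl.match(lookaheadRegex);
const longResult = longUrl.match(lookaheadRegex);
const slashStrict = slashUrl.match(lookaheadRegex);
const slashLoose = slashUrl.match(withTrailingSlash);
const longLoose = longUrl.match(withTrailingSlash);

console.log(shortResult); // [ 'https://twitter.com/page' ]
console.log(longResult);  // null
console.log(slashStrict); // null, "page" is followed by "/" rather than whitespace
console.log(slashLoose);  // [ 'https://twitter.com/page/' ]
console.log(longLoose);   // null
```

Because the lookahead requires whitespace or the end of the string, a URL followed directly by punctuation (for example a full stop) would not match with either pattern.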
