JavaScript match for string that doesn't contain a character
I'm trying to use JavaScript to match a URL pattern containing one directory, with an optional trailing slash.
For example:
This should match:
text and http://twitter.com/path and more text
This should not match:
text and http://twitter.com/path/other/directories and more text
Even though the shorter string exists in the longer one, I don't want the longer one to return anything.
Is this possible?
Here's what I've tried so far:
The approach was to match the URL, and then use either a negated character class or a negative lookahead.
I've tried the following:
/(https?:)?(\/\/)?(www\.)?twitter\.com\/[a-z0-9_+-]+\/?(?![a-z0-9_+-])/ig
This was meant to look for a Twitter URL with a \w+ path and an optional trailing slash, not followed by any other \w+.
While this doesn't include the second directory in its match, I wanted it to not match the string at all.
/(https?:)?(\/\/)?(www\.)?twitter\.com\/\w+[^\/\w]*/ig
This was meant to find the URL, but exclude any slashes and \w characters following it. Similar to the previous try, it still matches the long links.
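A small sketch of what each attempt actually returns for a longer URL. The two patterns are copied from above; the variable names and the test string are only illustrative.

```javascript
// Both attempts, copied from above (illustrative variable names).
const attempt1 = /(https?:)?(\/\/)?(www\.)?twitter\.com\/[a-z0-9_+-]+\/?(?![a-z0-9_+-])/ig;
const attempt2 = /(https?:)?(\/\/)?(www\.)?twitter\.com\/\w+[^\/\w]*/ig;

const longUrl = 'text https://twitter.com/page/status/123 text';

// attempt1: the optional slash is given up during backtracking, so the
// lookahead is tested right after "page", where the next character is "/".
const result1 = longUrl.match(attempt1);
console.log(result1); // [ 'https://twitter.com/page' ]

// attempt2: [^\/\w]* is allowed to match zero characters, so the match
// simply stops after "page".
const result2 = longUrl.match(attempt2);
console.log(result2); // [ 'https://twitter.com/page' ]
```

In both cases the short URL comes back as a prefix of the long one, which is exactly the result I want to avoid.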
I've tried variations like this, but can't get it to work:
var regex1 = /(https?:)?(\/\/)?(www\.)?twitter\.com\/\w+(?!\/\w+)/ig;
var regex2 = /(https?:)?(\/\/)?(www\.)?twitter\.com\/\w+[^\/\w]*/ig;

var shouldMatch = 'text https://twitter.com/page text';
var shouldNotMatch = 'text https://twitter.com/page/status/123 text';

console.log('regex1 should match', shouldMatch.match(regex1));
console.log('regex1 should return []', shouldNotMatch.match(regex1));
console.log('regex2 should match', shouldMatch.match(regex2));
console.log('regex2 should return []', shouldNotMatch.match(regex2));
You could use a negative lookahead at the end of the valid part, asserting that the page name is followed neither by a / and a word character, nor by another word character. Adding the bare word character as an alternative inside the negative lookahead stops the regex from backtracking and matching at (for example) http://twitter.com/pag.
var regex1 = /(https?:\/\/)?(www\.)?twitter\.com\/\w+(?!\/\w|\w)/ig;

var shouldMatch = 'https://twitter.com/page is a valid url';
var shouldNotMatch = 'https://twitter.com/page/status/123 is not valid';

console.log('regex1 should match', shouldMatch.match(regex1));
// Note: match() returns null, not [], when a /g regex finds nothing.
console.log('regex1 should return []', shouldNotMatch.match(regex1));
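To make the effect of the extra alternative concrete, here is a minimal side-by-side sketch. The two patterns differ only in the lookahead; the variable names are hypothetical and the test strings are the ones used above.

```javascript
// Hypothetical names; the two patterns differ only in the lookahead.
const withoutAlternation = /(https?:\/\/)?(www\.)?twitter\.com\/\w+(?!\/\w)/ig;
const withAlternation = /(https?:\/\/)?(www\.)?twitter\.com\/\w+(?!\/\w|\w)/ig;

const longUrl = 'https://twitter.com/page/status/123 is not valid';
const shortUrl = 'https://twitter.com/page is a valid url';

// \w+ gives back the final "e", and "pag" followed by "e" passes (?!\/\w).
const partial = longUrl.match(withoutAlternation);
console.log(partial); // [ 'https://twitter.com/pag' ]

// With |\w every shorter prefix is followed by a word character,
// so backtracking finds nothing and match() returns null.
const rejected = longUrl.match(withAlternation);
console.log(rejected); // null

const accepted = shortUrl.match(withAlternation);
console.log(accepted); // [ 'https://twitter.com/page' ]
```

Because the failed case yields null rather than an empty array, something like `(str.match(re) || [])` is a common way to get a list in both cases.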
I would use a positive lookahead to assert that the last group of \w+ is followed by either a whitespace character or the end of the string.
(?:https?:\/\/)?(?:www\.)?twitter\.com\/\w+(?=\s|$)
Demo: https://regex101.com/r/EMHxq9/3
I also used non-capturing groups in place of capturing groups because it doesn't look like you're backreferencing anything, and combined the optional "https:" and "//" parts because it would be odd to have one without the other.
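A rough sketch of how the positive-lookahead pattern behaves in JavaScript. The g and i flags, the sample strings, and the `withSlash` variant (which adds `\/?` so a single trailing slash is accepted, as the question asked) are assumptions for illustration, not part of the pattern above.

```javascript
// The pattern from this answer; the g and i flags are an assumption.
const strict = /(?:https?:\/\/)?(?:www\.)?twitter\.com\/\w+(?=\s|$)/gi;
// Hypothetical variant: \/? added so one trailing slash is accepted.
const withSlash = /(?:https?:\/\/)?(?:www\.)?twitter\.com\/\w+\/?(?=\s|$)/gi;

const plain = 'text https://twitter.com/page text';
const trailing = 'text https://twitter.com/page/ text';
const nested = 'text https://twitter.com/page/status/123 text';
const atEnd = 'ends with twitter.com/page';

console.log(plain.match(strict));       // [ 'https://twitter.com/page' ]
console.log(trailing.match(strict));    // null ("/" is neither whitespace nor end of string)
console.log(trailing.match(withSlash)); // [ 'https://twitter.com/page/' ]
console.log(nested.match(strict));      // null
console.log(nested.match(withSlash));   // null
console.log(atEnd.match(strict));       // [ 'twitter.com/page' ]
```

One trade-off of requiring whitespace or end of string is that a URL followed directly by punctuation, such as a full stop, is rejected as well.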