Excluding matches at the beginning or end of a line in a regex with JavaScript

Question

I'm trying to define a regular expression in JavaScript that matches all occurrences, excluding the ones at the beginning or at the end of a line.

I can exclude the ones at the beginning but not the ones at the end. For example:

const MULTILINE = `
Lorem ipsum dolor sit amet ANNA_END
ANNA_BEGIN lorem ipsum dolor sit amet
Lorem ipsum dolor ANNA_MIDDLE sit amet
`

MULTILINE.match(/ANNA\w+/gm)
// output: ["ANNA_END", "ANNA_BEGIN", "ANNA_MIDDLE"] ok

MULTILINE.match(/(?!^)ANNA\w+/gm)
// output: ["ANNA_END", "ANNA_MIDDLE"] ok

MULTILINE.match(/ANNA\w+(?!$)/gm)
// output: ["ANNA_EN", "ANNA_BEGIN", "ANNA_MIDDLE"] fail
// expected: ["ANNA_BEGIN", "ANNA_MIDDLE"]

As seen, it correctly identifies the string at the end of the line, but only cuts off its last character instead of excluding it (as if $ were being replaced by another \d expression).

I've read lots of documentation and tried several variations such as MULTILINE.match(/ANNA\w+(?!ANNA\w+$)/gm), but without success.

Any help here? :)

Answer 1

ANNA_END yields the match ANNA_EN because the (?!$) lookahead, when it fails, makes the engine backtrack. The pattern right before (?!$) is \w+, a + quantified pattern, so backtracking lets the match complete one character before the end of the line. The linked regex debugger demo shows this backtracking with a red arrow at Step 9.
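The backtracking described above can be reproduced in a few lines. This is a minimal sketch; the variable names `line` and `partial` are illustrative and not part of the original question.

```javascript
// A single line whose target word sits at the very end.
const line = "Lorem ipsum dolor sit amet ANNA_END";

// Greedy \w+ first consumes "_END". (?!$) fails there because the position
// is the end of the line, so \w+ gives back one character and the lookahead
// is retried in front of "D", where it succeeds.
const partial = line.match(/ANNA\w+(?!$)/gm);
console.log(partial); // ["ANNA_EN"]

// The lookahead is satisfied mid-word: "D" follows, not the end of the line.
console.log(/ANNA_EN(?!$)/m.test(line)); // true
```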

To disallow this partial word matching, you may add a word boundary, \b, or another lookahead, (?!\w).
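As a rough check of both options, the sketch below applies each one to a line that ends with the word and to a line that has the word in the middle. The sample strings and variable names are assumptions made for illustration.

```javascript
const endOfLine = "Lorem ipsum dolor sit amet ANNA_END";
const midLine = "Lorem ipsum dolor ANNA_MIDDLE sit amet";

// Option 1: \b forces the match to stop only at a real word boundary,
// so backtracking into the middle of "ANNA_END" can no longer succeed.
const withBoundary = endOfLine.match(/ANNA\w+\b(?!$)/gm);
console.log(withBoundary); // null

// Option 2: (?!\w) rejects any position that is still followed by a word char.
const withLookahead = endOfLine.match(/ANNA\w+(?!\w)(?!$)/gm);
console.log(withLookahead); // null

// A word in the middle of the line is still matched in full.
const middle = midLine.match(/ANNA\w+\b(?!$)/gm);
console.log(middle); // ["ANNA_MIDDLE"]
```

Both variants behave the same on these inputs; \b is simply the shorter spelling.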

The complete solution to match ANNA\w+ when it is not at the start or end of a line (with the m flag, ^ and $ match at line boundaries) will look like

/(?!^)\bANNA\w+\b(?!$)/gm

See the regex demo.

Details

(?!^) - a negative lookahead that fails the match if the regex index is at the start of a line (with the m flag, ^ matches at the start of every line, not only at the start of the string)
\b - a word boundary
ANNA - a literal substring
\w+ - one or more word chars
\b - a word boundary
(?!$) - a negative lookahead that fails the match if the regex index is at the end of a line (with the m flag, $ matches at the end of every line).
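The two lookaheads in the breakdown above depend on the m flag to treat every line separately. The following sketch compares the same pattern with and without that flag; the sample `text` and the variable names are made up for illustration.

```javascript
const text = "Lorem ipsum ANNA_END\nANNA_BEGIN lorem ipsum\nLorem ANNA_MIDDLE amet";

// With m, ^ and $ match at every line break, so the words at the end of
// line 1 and at the start of line 2 are both rejected.
const perLine = text.match(/(?!^)\bANNA\w+\b(?!$)/gm);
console.log(perLine); // ["ANNA_MIDDLE"]

// Without m, ^ and $ only match at the very start and end of the whole
// string, so neither lookahead rejects anything in this input.
const perString = text.match(/(?!^)\bANNA\w+\b(?!$)/g);
console.log(perString); // ["ANNA_END", "ANNA_BEGIN", "ANNA_MIDDLE"]
```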

JS demo:

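const MULTILINE = `
Lorem ipsum dolor sit amet ANNA_END
ANNA_BEGIN lorem ipsum dolor sit amet
Lorem ipsum dolor ANNA_MIDDLE sit amet
`;
// Only the occurrence that is neither at the start nor at the end of a line is kept.
console.log(MULTILINE.match(/(?!^)\bANNA\w+\b(?!$)/gm)); // ["ANNA_MIDDLE"]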

Answer 1: accepted, 2019-03-15 21:29:00
