Regex to match a word that is not immediately preceded by another word, but may be preceded by it earlier in the string

Question

I need to match all strings that contain one word from a list, but only if that word is not immediately preceded by another specific word. I have this regex:

.*(?<!forbidden)\b(word1|word2|word3)\b.*

That still matches a sentence like hello forbidden word1, because forbidden is matched by .* . But if I remove the .* , I no longer match strings like hello word1, which I do want to match.

Note that I do want to match a string like forbidden hello word1.
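The behaviour described above can be reproduced with a short check. This is only a sketch, assuming Python's built-in re module (the question itself does not name an engine); the helper name original_matches is made up for illustration. One detail worth noting: the lookbehind inspects only the nine characters directly before word1, and in hello forbidden word1 those are "orbidden " including the space, so it never sees the text forbidden.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = r".*(?<!forbidden)\b(word1|word2|word3)\b.*"


def original_matches(text):
    """Return True if the question's pattern matches the whole string."""
    return re.fullmatch(ORIGINAL, text) is not None


# The first two are wanted matches; the third is the unwanted one.
for sample in ("hello word1", "forbidden hello word1", "hello forbidden word1"):
    print(f"{sample!r}: {original_matches(sample)}")  # True for all three
```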

Could you suggest how to fix this problem?

Answer 1

This one seems to work well:

^.*\b(?!(?:forbidden|word[1-3])\b)\w+ (word[1-3]).*$

\b(?!(?:forbidden|word[1-3])\b)\w+ matches a single word that is neither forbidden nor one of word[1-3]. The space and (word[1-3]) that follow require this word to sit directly before the target word.

So it matches hi forbidden hello word1 test but not hi hello forbidden word2 test.
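A quick way to try this pattern, sketched with Python's built-in re module (the helper name answer1_matches is hypothetical). As far as I can tell, the pattern needs some other word followed by exactly one space in front of the target, so a string consisting of word1 alone, or one with punctuation before the target, is not matched.

```python
import re

# Pattern from this answer, unchanged.
ANSWER_1 = r"^.*\b(?!(?:forbidden|word[1-3])\b)\w+ (word[1-3]).*$"


def answer1_matches(text):
    """Return True if the pattern from this answer matches the string."""
    return re.search(ANSWER_1, text) is not None


samples = [
    "hi forbidden hello word1 test",  # matched: 'hello' precedes word1
    "hi hello forbidden word2 test",  # not matched: 'forbidden' precedes word2
    "hello word1",                    # matched
    "word1",                          # not matched: no preceding word at all
    "hello, word1",                   # not matched: comma before the space
]
for sample in samples:
    print(f"{sample!r}: {answer1_matches(sample)}")
```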

Answer 2

If what you want is to match the entire string, try this:

Regex test

The idea comes from this thread: Regular expression to match a line that doesn't contain a word

I've just reversed the order of the look-around.

((?<!forbidden )\b(word1|word2|word3)\b) is what you defined.
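The full regex behind the Regex test link is not reproduced here, so the following only sketches how the fragment above behaves by itself. It assumes Python's built-in re module and a made-up helper name, contains_allowed_word. Because re.search scans for a match anywhere in the string, no surrounding .* is needed to decide whether a string qualifies.

```python
import re

# The fragment quoted above, without the outer capture group.
FRAGMENT = r"(?<!forbidden )\b(word1|word2|word3)\b"


def contains_allowed_word(text):
    """True if a listed word occurs that is not directly preceded by 'forbidden '."""
    return re.search(FRAGMENT, text) is not None


samples = [
    "hello word1",            # True
    "forbidden hello word1",  # True: 'hello' sits in between
    "hello forbidden word1",  # False: 'forbidden ' is directly before word1
    "word2",                  # True: nothing precedes the word
]
for sample in samples:
    print(f"{sample!r}: {contains_allowed_word(sample)}")
```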

But I just can't understand why you need this requirement.

Answer 3

Have a look at word boundaries: \bword can never touch a word character on its left.

To match (word1|word2|word3) only if it is not preceded by forbidden followed by

one \W (non-word character):
```
^.*?\b(?<!forbidden\W)(word1|word2|word3)\b.*
```
See this demo at regex101.
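A small check of the single-separator pattern, sketched with Python's built-in re module (the helper name single_separator_matches is hypothetical). The last sample shows the gap this variant leaves open: with two separators after forbidden, the fixed-width lookbehind no longer lines up and the string is matched.

```python
import re

# Pattern from this answer: 'forbidden' plus exactly one non-word character.
SINGLE_SEPARATOR = r"^.*?\b(?<!forbidden\W)(word1|word2|word3)\b.*"


def single_separator_matches(text):
    """Return True if the single-separator pattern matches the string."""
    return re.search(SINGLE_SEPARATOR, text) is not None


samples = [
    "hello word1",             # True
    "forbidden hello word1",   # True
    "hello forbidden word1",   # False: one space after 'forbidden'
    "hello forbidden,word1",   # False: one comma after 'forbidden'
    "hello forbidden  word1",  # True: two spaces slip past the lookbehind
]
for sample in samples:
    print(f"{sample!r}: {single_separator_matches(sample)}")
```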
multiple \W (any number of non-word characters):
Lookbehinds need to be of fixed length in Python's re module. To get around this, one idea is to consume \W* outside the lookbehind and to put (?<!\W) in front of it, which pins the position from which (?<!forbidden) looks behind to the end of the previous word.
```
^.*?(?<!forbidden)(?<!\W)\W*\b(word1|word2|word3)\b.*
```
Regex101 demo (in the multiline demo I used [^\w\n] instead of \W so as not to skip over lines).
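The same kind of check for the multiple-separator pattern, again a sketch with Python's built-in re module and a hypothetical helper name. Here any run of non-word characters between forbidden and the listed word blocks the match, and a listed word at the very start of the string is still accepted.

```python
import re

# Pattern from this answer: fixed-width lookbehinds plus \W* outside them.
MULTI_SEPARATOR = r"^.*?(?<!forbidden)(?<!\W)\W*\b(word1|word2|word3)\b.*"


def multi_separator_matches(text):
    """Return True if the multiple-separator pattern matches the string."""
    return re.search(MULTI_SEPARATOR, text) is not None


samples = [
    "hello word1",             # True
    "forbidden hello word1",   # True
    "word1",                   # True: nothing precedes the word
    "hello forbidden word1",   # False
    "hello forbidden  word1",  # False: two spaces are handled as well
    "hello forbidden, word1",  # False: comma plus space
]
for sample in samples:
    print(f"{sample!r}: {multi_separator_matches(sample)}")
```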
Certainly a variable-width lookbehind, such as (?<!forbidden\W+), would be more comfortable. The PyPI regex module (import regex as re) supports lookbehinds of variable length: see this demo.
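A sketch of the variable-width variant. Two assumptions are made here: the full pattern is the single-separator pattern with \W widened to \W+ (the answer only gives the lookbehind fragment), and the third-party regex package from PyPI is installed. The import is guarded so the script still runs without it; the helper names are hypothetical.

```python
import re

try:
    import regex  # third-party package from PyPI: pip install regex
except ImportError:
    regex = None  # the regex-based part is skipped when the package is missing

# Assumed full pattern: the first pattern of this answer with \W widened to \W+.
VARIABLE_WIDTH = r"^.*?\b(?<!forbidden\W+)(word1|word2|word3)\b.*"


def stdlib_rejects_pattern():
    """Return True if the built-in re module refuses to compile the pattern."""
    try:
        re.compile(VARIABLE_WIDTH)
    except re.error:
        return True
    return False


def variable_width_matches(text):
    """Match with the third-party regex module (requires `pip install regex`)."""
    if regex is None:
        raise RuntimeError("the third-party 'regex' package is not installed")
    return regex.search(VARIABLE_WIDTH, text) is not None


print("built-in re rejects the pattern:", stdlib_rejects_pattern())
if regex is not None:
    for sample in ("hello word1", "hello forbidden word1", "hello forbidden  word1"):
        print(f"{sample!r}: {variable_width_matches(sample)}")
```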

Note: If you do not need to capture anything, (?: non-capturing groups ) can be used as well.
