
Regular expression to match a word but not inside backticks

Here is an example with several paragraphs:

Upgrade is the first word in this paragraph.
In this paragraph, upgrade is the last word.
And this paragraph endsupgrade with upgrade.
But I don't want to upgradefind that word in this command `gigalixir:upgrade`.

As you can see, the word upgrade occurs 6 times in the four lines above. I am trying to find all of these occurrences except the last one (because it is part of a command inside backticks). I also do not want to match occurrences that are not independent words, such as endsupgrade or upgradefind.

So in the text above, the words marked with double asterisks should be selected:

**Upgrade** is the first word in this paragraph.
In this paragraph, **upgrade** is the last word.
And this paragraph endsupgrade with **upgrade**.
But I don't want to upgradefind that word in this command `gigalixir:upgrade`.

I have tried this simple regex:

/\bupgrade\b/gi

That selects all the independent words, but I want to ignore the word upgrade when it appears inside backticks.

Note: I do not want to use lookahead or lookbehind, because I am running this regex in the browser, and no browser except Chrome supports them.

You can match the strings inside backticks and skip them, and match upgrade as a whole word only in all other contexts:

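const text = 'Upgrade is the first word in this paragraph.\nIn this paragraph, upgrade is the last word.\nAnd this paragraph endsupgrade with upgrade.\nBut I don\'t want to upgradefind that word in this command `gigalixir:upgrade`.';
// Group 1 captures a backtick-delimited span; the second alternative matches the whole word
const regex = /(`[^`]*`)|\bupgrade\b/gi;
// Put backtick spans back unchanged (y) and wrap every other match (x) in double asterisks
console.log(text.replace(regex, (x, y) => y || `**${x}**`));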

The (`[^`]*`)|\bupgrade\b regex matches:

  • (`[^`]*`) - Capturing group 1 (it helps to analyze the match structure later): a backtick, zero or more chars other than a backtick, and a closing backtick
  • | - or
  • \bupgrade\b - the whole word upgrade (case-insensitive due to the i flag).

The .replace(regex, (x,y) => y || `**${x}**`) call means that each match is passed to an arrow function where x is the whole match and y is the Group 1 value. If Group 1 took part in the match (the match is a backtick-delimited span), its value is put back unchanged; otherwise, the whole match is wrapped in double asterisks.
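If the goal is to locate the matches rather than rewrite the text, the same "match and skip" idea can be sketched with an exec loop. The helper name findOutsideBackticks and the sample string are hypothetical, and the sketch assumes the word is non-empty and contains no regex metacharacters:

```javascript
// Hypothetical helper: returns the index and text of every standalone
// occurrence of `word` that is not inside a backtick-delimited span.
// Assumes `word` is non-empty and contains no regex metacharacters.
function findOutsideBackticks(text, word) {
  const regex = new RegExp('(`[^`]*`)|\\b' + word + '\\b', 'gi');
  const hits = [];
  let m;
  // With the g flag, each exec call resumes after the previous match.
  while ((m = regex.exec(text)) !== null) {
    // m[1] is set only when the backtick alternative matched; skip those.
    if (m[1] === undefined) {
      hits.push({ index: m.index, match: m[0] });
    }
  }
  return hits;
}

const sample = 'Upgrade now, then upgrade again, but not `gigalixir:upgrade`.';
console.log(findOutsideBackticks(sample, 'upgrade'));
```

For this sample the helper reports two hits; the occurrence inside the backticks is consumed by Group 1 and dropped.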

Alternatively, you may use a known workaround with a positive lookahead, which works only if the backticks in the string are paired (that is, the string contains an even number of them):

\bupgrade\b(?=(?:[^`]*`[^`]*`)*[^`]*$)


The (?=(?:[^`]*`[^`]*`)*[^`]*$) lookahead matches a location that is immediately followed by:

  • (?:[^`]*`[^`]*`)* - zero or more repetitions of: zero or more chars other than a backtick, a backtick, zero or more chars other than a backtick, and another backtick
  • [^`]* - zero or more chars other than a backtick
  • $ - end of string.
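A minimal sketch of how this lookahead behaves, using two made-up sample strings. Lookaheads (unlike lookbehinds) have long been part of JavaScript regular expressions, so this variant should run in any browser, but it miscounts as soon as a backtick is left unclosed:

```javascript
// The word must be followed by an even number of backticks up to the end of the string.
const lookaheadRegex = /\bupgrade\b(?=(?:[^`]*`[^`]*`)*[^`]*$)/gi;

// Sample inputs are assumptions made up for this illustration.
const paired = 'Run upgrade, not `gigalixir:upgrade`.';
const unpaired = 'Run upgrade, not `gigalixir:upgrade.';

// '$&' in the replacement string stands for the matched text.
const pairedResult = paired.replace(lookaheadRegex, '**$&**');
const unpairedResult = unpaired.replace(lookaheadRegex, '**$&**');

console.log(pairedResult);   // only the word outside the backticks is wrapped
console.log(unpairedResult); // the wrong occurrence is wrapped
```

With paired backticks only the first upgrade is marked. With the closing backtick missing, the first occurrence is followed by an odd number of backticks and is rejected, while the one after the unclosed backtick is accepted, which is why the capture-and-skip approach is the more robust of the two.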
