
Regex condition after and before a known phrase

I am trying to capture a phrase that starts with a capital letter and sits between two known phrases, say between "Known phrase, " and the word "The".

For example, in the text below, the phrase I'm trying to capture is: Stuff TO CApture That always start with Capital letter but stop capturing when

Ignore Words Known phrase, ignore random phrase Stuff TO CApture That always start with Capital letter but stop capturing when The appears.

Regexes I have tried: (?<=Known phrase, ).*(?= The) and Known phrase, (.*) The
Both of these also capture "ignore random phrase". How do I ignore this?
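To reproduce the problem, here is a minimal sketch in Python. The choice of Python's re module is an assumption, since the question does not name a regex flavor. Both attempts return the unwanted lowercase lead-in together with the target phrase.

```python
import re

# Sample text from the question.
TEXT = (
    "Ignore Words Known phrase, ignore random phrase Stuff TO CApture "
    "That always start with Capital letter but stop capturing when The appears."
)

# Attempt 1: lookbehind / lookahead, the whole match is the result.
lookaround = re.search(r"(?<=Known phrase, ).*(?= The)", TEXT)

# Attempt 2: literal anchors with a capturing group.
grouped = re.search(r"Known phrase, (.*) The", TEXT)

# Both results start with "ignore random phrase", which is the part to skip.
print(lookaround.group(0))
print(grouped.group(1))
```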

I guess that, since regular expressions match greedily from the left, you should first match anything that is not a capital letter.

Something like /Start[^A-Z]*(.*)stop/ ([^A-Z] matches anything that is not a capital letter)
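As a rough illustration of this idea, the sketch below substitutes the question's delimiters for the Start and stop placeholders. That substitution, the Python re module, and the names TEXT and PATTERN are assumptions made for the example.

```python
import re

TEXT = (
    "Ignore Words Known phrase, ignore random phrase Stuff TO CApture "
    "That always start with Capital letter but stop capturing when The appears."
)

# "Start" -> "Known phrase, " and "stop" -> " The" (assumed substitution).
# [^A-Z]* greedily consumes everything up to the first capital letter,
# so the capturing group can only begin at that capital letter.
PATTERN = re.compile(r"Known phrase, [^A-Z]*(.*) The")

match = PATTERN.search(TEXT)
captured = match.group(1) if match else None
print(captured)
# Stuff TO CApture That always start with Capital letter but stop capturing when
```

Note that (.*) is greedy here, so if " The" occurs more than once the capture runs up to the last occurrence.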

regex101 demo

For your example data, you might use:

Known phrase, [a-z ]+([A-Z].*?) The

See the regex demo

Explanation

  • Known phrase, Match the literal text "Known phrase, " (including the trailing space)
  • [a-z ]+ Match 1+ times a lowercase character or a space (add to the character class anything else you want to allow, except an uppercase character)
  • ([A-Z].*?) Capture in a group an uppercase character followed by 0+ times any character except a newline, as few as possible
  • The Match the literal text " The" (including the leading space)
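The pattern explained above can be tried out with a short Python sketch. The re module and the second input string, TWO_STOPS, are assumptions added to show what the lazy .*? changes when " The" appears twice.

```python
import re

TEXT = (
    "Ignore Words Known phrase, ignore random phrase Stuff TO CApture "
    "That always start with Capital letter but stop capturing when The appears."
)

# Lazy version from the answer: stops at the first " The" after the capital.
LAZY = re.compile(r"Known phrase, [a-z ]+([A-Z].*?) The")
# Greedy variant, for comparison only: runs to the last " The".
GREEDY = re.compile(r"Known phrase, [a-z ]+([A-Z].*) The")

result = LAZY.search(TEXT).group(1)
print(result)
# Stuff TO CApture That always start with Capital letter but stop capturing when

# Hypothetical input with two stop words, to show the difference.
TWO_STOPS = "Known phrase, skip this Keep me The rest The end"
lazy_result = LAZY.search(TWO_STOPS).group(1)      # "Keep me"
greedy_result = GREEDY.search(TWO_STOPS).group(1)  # "Keep me The rest"
print(lazy_result, "|", greedy_result)
```

Because [a-z ]+ requires at least one character, this pattern does not match when the capitalized phrase follows "Known phrase, " directly; changing + to * would allow that case.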

I'm not sure what you are trying to do, but, sticking with your code, (?<=Known phrase, )([^A-Z]*)(.*)(?=The) should do the trick: the text you need is in group 2.
If you need to match the whole input, just change it to (.*)(?<=Known phrase, )([^A-Z]*)(.*)(?=The)(.*) and get your text from group 3.
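Both variants can be checked with a small Python sketch (the re module is an assumption; Python accepts this lookbehind because it has a fixed width). Since the lookahead is (?=The) without a leading space, the captured group keeps a trailing space, which the example strips.

```python
import re

TEXT = (
    "Ignore Words Known phrase, ignore random phrase Stuff TO CApture "
    "That always start with Capital letter but stop capturing when The appears."
)

# Variant 1: lookarounds only; the wanted text is in group 2.
partial = re.search(r"(?<=Known phrase, )([^A-Z]*)(.*)(?=The)", TEXT)
skipped = partial.group(1)       # "ignore random phrase "
wanted_raw = partial.group(2)    # ends with a space before "The"
wanted = wanted_raw.strip()

# Variant 2: the pattern consumes the whole input; the wanted text is in group 3.
full = re.search(r"(.*)(?<=Known phrase, )([^A-Z]*)(.*)(?=The)(.*)", TEXT)
wanted_full = full.group(3).strip()
tail = full.group(4)             # "The appears."

print(wanted)
print(wanted_full)
```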
