
Find a string after two other strings with something between them

Let's go with an example:

"Blablabla. My name is John and I'm 21 years old. Blablabla"

Another example:

"Blablabla. My name is John and I'm 21 years old. - Hi I'm Mary and I'm 22 years old."

Basically, I want to match the age of the first person (here 21, but it could be 23 or anything else). I know I'll have a sentence beginning with "My name is $name and I'm 21", but I can't know in advance what $name is. The rough idea is to select the number that follows "My name is " + something + " and I'm ".

How would one do that with a regex, given that I can't use capture groups?

What I have so far:

    (?<=My name is )(.*)(?= years old)

Ideally, I would like something like this to work:

    (?<=My name is .* and I'm )(.*)(?= years old)

... but it does not! Apparently .* can't be used inside a lookbehind (which makes some sense).

Thank you kindly.

    /My name is (\w+) and I'm (\d+) years old./

Now the first matched group is the name, and the second matched group is the age.
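As a rough illustration of the capture-group approach, here is a minimal Python sketch using the standard `re` module. The choice of Python, the variable names, and the escaped final period are assumptions made for this example; the answer above does not name a language.

```python
import re

# Pattern from the answer above; the final period is escaped here
# (an adjustment for this sketch) so it matches a literal "." only.
PATTERN = re.compile(r"My name is (\w+) and I'm (\d+) years old\.")

text = ("Blablabla. My name is John and I'm 21 years old. "
        "- Hi I'm Mary and I'm 22 years old.")

# search() stops at the first match, so only the first person is reported.
match = PATTERN.search(text)
name, age = match.group(1), match.group(2)
print(name, age)  # John 21
```

Because `search()` returns only the first match, Mary's sentence is never considered.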


If for some reason you don't want to use groups, you can match:

    /(?<=My name is )\w+(?= and I'm )/

for the name and:

    /(?<= and I'm )\d+(?= years old.)/

for the age.
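The lookaround variant can be tried out the same way. The sketch below is only an illustration in Python's `re` module: the name pattern is written out here as an assumption, and both lookbehinds are fixed-width, which is why this engine accepts them.

```python
import re

text = ("Blablabla. My name is John and I'm 21 years old. "
        "- Hi I'm Mary and I'm 22 years old.")

# Both lookbehinds contain only literal text, so their width is fixed.
NAME = re.compile(r"(?<=My name is )\w+(?= and I'm )")
AGE = re.compile(r"(?<= and I'm )\d+(?= years old\.)")

first_name = NAME.search(text).group(0)
first_age = AGE.search(text).group(0)
all_ages = AGE.findall(text)

print(first_name, first_age)  # John 21
print(all_ages)               # ['21', '22']
```

Note that the age pattern is not tied to "My name is", so it also matches Mary's age; taking only the first match is what restricts the result to the first person.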


As you have noticed, variable-length lookbehinds are not allowed (at least in the regex engines that I know of; it is not logically impossible). However, you can use \K as an alternative:

    /My name is \w+ and I'm \K\d+(?= years old.)/
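Support for these constructs varies by engine. As one data point, the sketch below checks which patterns Python's standard `re` module is willing to compile; the helper name `compiles` is made up for this example. On recent Python 3 versions, `re` rejects the variable-length lookbehind, and it does not recognise `\K` either, so the `\K` pattern above needs an engine such as PCRE, Perl, or Ruby.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Fixed-width lookbehind: accepted.
fixed_ok = compiles(r"(?<= and I'm )\d+")
# Variable-length lookbehind, as in the question: rejected.
variable_ok = compiles(r"(?<=My name is .* and I'm )\d+")
# \K is not part of Python's re syntax.
keep_out_ok = compiles(r"My name is \w+ and I'm \K\d+")

print(fixed_ok, variable_ok, keep_out_ok)
```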

@ndn's answer is basically correct, but I think it needs a couple of modifications:

  1. The \w+ expression will not match spaces, such as in "My name is Mary Kate and I'm 47 years old."
  2. If I'm interpreting your request correctly, and you only need the age to match, then I don't think the lookbehind and lookahead assertions that you and @ndn have set up are necessary.

I believe this regex will give you what you want:

    My name is .+? and I'm (\d+) years old\.

(Note the \. at the end, so that it matches a literal period rather than any character.)

See the example at https://regex101.com/r/nJ7wS5/1
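To see the difference between `\w+` and the lazy `.+?` on a name that contains a space, here is a small Python sketch. The sample list and variable names are assumptions for illustration only.

```python
import re

# Lazy .+? lets the name contain spaces; the age is captured in group 1.
LAZY = re.compile(r"My name is .+? and I'm (\d+) years old\.")
# \w+ version for comparison: it cannot span the space in "Mary Kate".
WORD = re.compile(r"My name is \w+ and I'm (\d+) years old\.")

samples = [
    "Blablabla. My name is John and I'm 21 years old. Blablabla",
    "My name is Mary Kate and I'm 47 years old.",
]

ages = [LAZY.search(s).group(1) for s in samples]
word_hit = WORD.search(samples[1])

print(ages)      # ['21', '47']
print(word_hit)  # None
```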
