Find the substring before a word in a string, starting from a number

Question

I have a string:

"abc mysql 23 rufos kanso engineer"

I want the regex to output the part of the string that comes before the word "engineer", going back until it sees a number.

That is, the regex should output:

23 rufos kanso

Another example:

String:

"def grusol defno 1635 minos kalopo, ruso engineer okas puno"

Again, I want the part of the string that comes before the word "engineer", going back until it sees a number.

That is, the regex should output:

1635 minos kalopo, ruso

I can achieve this with a series of regexes.

Can I do this in one shot?

Thanks

Answer 1

The pattern I'd use: ((\d+)(?!.*\d).*)engineer. It looks for the last number in the string and matches from there up to engineer.

Something like (\d.*)engineer would also work, but only if there is a single number before engineer.

>>> import re
>>> string = '123 abc mysql 23 rufos kanso engineer'
>>> pattern = r'((\d+)(?!.*\d).*)engineer'
>>> re.search(pattern, string).group(1)
'23 rufos kanso '
>>>

Edit

If there are digits after the 'engineer' part, the pattern above does not work, as you pointed out in the comment. I tried to solve it, but honestly I couldn't come up with a new pattern (sorry).
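To make the failure concrete, here is a small check. The helper name last_number_to_keyword is made up for this illustration. With trailing digits, every number before 'engineer' still has a digit somewhere after it, so the negative lookahead rejects all of them and the search finds nothing:

```python
import re

# Same pattern as above: (?!.*\d) rejects any number that has a digit later in the string.
PATTERN = re.compile(r'((\d+)(?!.*\d).*)engineer')

def last_number_to_keyword(text):
    """Return group 1 of the match, or None when the pattern does not match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

print(last_number_to_keyword('123 abc mysql 23 rufos kanso engineer'))
# '23 rufos kanso '
print(last_number_to_keyword('123 abc mysql 23 rufos kanso engineer 1234 b65 de'))
# None, so calling .group(1) directly on the search result would raise AttributeError
```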

The workaround I can suggest, assuming 'engineer' is still the key word, is to split your initial string on that word first.

Here is an illustration of what I mean:

>>> string = '123 abc mysql 23 rufos kanso engineer 1234 b65 de'
>>> string.split('engineer')
['123 abc mysql 23 rufos kanso ', ' 1234 b65 de']
>>> string.split('engineer')[0] 
'123 abc mysql 23 rufos kanso '

# hence, there would be no unexpected digits

>>> s = string.split('engineer')[0]
>>> pattern = r'((\d+)(?!.*\d).*)'
>>> re.search(pattern, s).group(1)
'23 rufos kanso '
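If a single pattern is still preferred, one possible sketch (not from the original answer, and only checked against the example strings in this thread) is to match a number followed by non-digit characters only, up to the first 'engineer'. The names NUMBER_TO_KEYWORD and before_keyword are made up for this illustration:

```python
import re

# \d+   : a run of digits
# \D*?  : as few non-digit characters as possible, so no other number can sit in between
# The match must then be followed immediately by the literal word 'engineer'.
NUMBER_TO_KEYWORD = re.compile(r'(\d+\D*?)engineer')

def before_keyword(text):
    """Return the text from the last number before 'engineer' up to that word, or None."""
    match = NUMBER_TO_KEYWORD.search(text)
    return match.group(1) if match else None

print(before_keyword('123 abc mysql 23 rufos kanso engineer 1234 b65 de'))
# '23 rufos kanso '
print(before_keyword('def grusol defno 1635 minos kalopo, ruso engineer okas puno'))
# '1635 minos kalopo, ruso '
```

Because \D cannot cross a digit, digits after 'engineer' no longer matter. As with the pattern above, the trailing space stays in the result; call .strip() on it if that is unwanted.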

Answer 2

Use positive lookaheads: start the match at a position that is followed by a digit, and stop it just before the word engineer.

The regex: (?=\d)(.+)(?=engineer)

Just to give you an idea:

import re

pattern = r"(?=\d)(.+)(?=engineer)"
# Named `inputs` rather than `input` to avoid shadowing the built-in.
inputs = [
    '"def grusol defno 1635 minos kalopo, ruso engineer okas puno"',
    '"abc mysql 23 rufos kanso engineer"',
]

matches = []

for item in inputs:
    matches.append(re.findall(pattern, item))

print(matches)

Output:

[['1635 minos kalopo, ruso '], ['23 rufos kanso ']]
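One caveat worth checking before relying on this pattern: the first lookahead succeeds at the first digit in the string, so when several numbers come before engineer, the match starts at the earliest one rather than the closest one. A quick sketch using the test string from Answer 1 (the variable names here are just for illustration):

```python
import re

pattern = r"(?=\d)(.+)(?=engineer)"
text = '123 abc mysql 23 rufos kanso engineer'

# The match begins at '123', the first digit, not at '23'.
result = re.findall(pattern, text)
print(result)
# ['123 abc mysql 23 rufos kanso ']
```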

Answer 3

Have a look at this site. It is great for experimenting with regex, and it explains every step.
Here is a solution to your problem: link

Answer 1
0 Accepted 2019-07-11 07:16:03

Answer 2
0 2019-07-11 07:18:27

Answer 3
0 2019-07-11 07:18:48