Searching for a pattern in a sentence with regex in Python

Question

I want to capture the digits that follow a certain phrase and also the start and end index of the number of interest.

Here is an example:

text = "The special code is 034567 in this particular case and not 98675"

In this example, I am interested in capturing the number 034567, which comes after the phrase special code, and also the start and end index of the number 034567.

My code is:

p = re.compile('special code \s\w.\s (\d+)')
re.search(p, text)

But this does not match anything. Could you explain why and how I should correct it?

Answer 1

Use re.findall with a capture group:

import re

text = "The special code is 034567 in this particular case and not 98675"
matches = re.findall(r'\bspecial code (?:\S+\s+)?(\d+)', text)
print(matches)

This prints:

['034567']
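
Note that re.findall only returns the captured strings, not their positions. Since the question also asks for the start and end index, here is a minimal sketch that reuses the same pattern with re.finditer, which yields match objects instead. The variable names pattern and results are just illustrative choices.

```python
import re

text = "The special code is 034567 in this particular case and not 98675"
pattern = re.compile(r'\bspecial code (?:\S+\s+)?(\d+)')

# finditer yields match objects, so the position of group 1 is available
# through start(1) and end(1) in addition to the captured string.
results = [(m.group(1), m.start(1), m.end(1)) for m in pattern.finditer(text)]
print(results)  # [('034567', 20, 26)]
```

The end index is exclusive, so text[20:26] gives back the number.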

Answer 2

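Your pattern fails because it demands more whitespace than the text contains. The literal space followed by \s after special code requires two whitespace characters, \w. then matches exactly two characters (one word character plus any one character other than a line break), and \s followed by another literal space again requires two whitespace characters. The text has only single spaces between words, so there is no match.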
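
To see this concretely, the sketch below runs the pattern from the question against the original text and against a contrived string (an assumption made up for illustration, not something from the question) that happens to have the doubled spaces and the two-character word the pattern requires.

```python
import re

text = 'The special code is 034567 in this particular case and not 98675'
original = re.compile(r'special code \s\w.\s (\d+)')

# Only one space follows "special code" in the real text, so nothing matches.
print(original.search(text))  # None

# Contrived input: two spaces, a two-character word, then two spaces again.
contrived = 'special code  is  034567'
print(original.search(contrived).group(1))  # 034567
```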

Instead, match one or more whitespace characters between words with \s+, and use \S+ rather than \w. to match any run of non-whitespace characters.

Use:

import re
text = 'The special code is 034567 in this particular case and not 98675'
p = re.compile(r'special code\s+\S+\s+(\d+)')
m = p.search(text)
if m:
    print(m.group(1)) # 034567
    print(m.span(1))  # (20, 26)
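
If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: find_code and its phrase parameter are hypothetical names, and it assumes exactly one non-whitespace word sits between the phrase and the digits, as in the example text.

```python
import re

def find_code(text, phrase="special code"):
    """Return (digits, start, end) for the number after phrase, or None."""
    # re.escape keeps any regex metacharacters in the phrase literal.
    pattern = re.compile(re.escape(phrase) + r'\s+\S+\s+(\d+)')
    m = pattern.search(text)
    if m is None:
        return None
    start, end = m.span(1)  # end is exclusive, like a slice bound
    return m.group(1), start, end

text = 'The special code is 034567 in this particular case and not 98675'
result = find_code(text)
print(result)  # ('034567', 20, 26)
```

Slicing text with the returned indices, text[20:26], gives the same digits back, which is a quick way to check the span.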

2 answers

solution1
0 2020-05-20 09:07:00

solution2
0 ACCEPTED 2020-05-20 09:19:14