Searching for a pattern in a sentence with regex in Python
I want to capture the digits that follow a certain phrase, along with the start and end index of that number.
Here is an example:
text = "The special code is 034567 in this particular case and not 98675"
In this example, I am interested in capturing the number 034567, which comes after the phrase special code, along with the start and end index of that number.
My code is:
p = re.compile('special code \s\w.\s (\d+)')
re.search(p, text)
But this does not match anything. Could you explain why, and how I should correct it?
Use re.findall with a capture group:
import re
text = "The special code is 034567 in this particular case and not 98675"
matches = re.findall(r'\bspecial code (?:\S+\s+)?(\d+)', text)
print(matches)
This prints:
['034567']
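Note that re.findall returns only the captured strings, not their positions. Since the question also asks for the start and end index, here is a minimal sketch that reuses the same pattern with re.finditer, which yields match objects. The variable names pattern and results are only illustrative.

```python
import re

text = "The special code is 034567 in this particular case and not 98675"

# Same pattern as in the findall example; finditer yields match objects
# rather than plain strings, so the position of group 1 is available.
pattern = re.compile(r'\bspecial code (?:\S+\s+)?(\d+)')

results = [(m.group(1), m.start(1), m.end(1)) for m in pattern.finditer(text)]
print(results)  # [('034567', 20, 26)]
```

The end index is exclusive, so text[20:26] gives back the captured digits.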
Your expression fails because of its whitespace handling. After code, the literal space followed by \s requires two whitespace characters in a row. Then \w. matches one word character followed by any single character other than a line break. Finally, \s followed by another literal space again requires two whitespace characters. The text has only single spaces between words, so the pattern cannot match.
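To see the doubled-whitespace requirement in action, here is a small sketch that runs the original pattern (written as a raw string) against the question's text and against a made-up variant with two spaces around "is". The double_spaced string is a hypothetical input, invented only to show what the pattern would accept.

```python
import re

# The pattern from the question, as a raw string so the backslashes
# reach the regex engine unchanged.
original = re.compile(r'special code \s\w.\s (\d+)')

single_spaced = "The special code is 034567 in this particular case"
# Hypothetical input: two spaces before and after "is".
double_spaced = "The special code  is  034567 in this particular case"

no_match = original.search(single_spaced)
match = original.search(double_spaced)

print(no_match)        # None
print(match.group(1))  # 034567
```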
Instead, match one or more whitespace characters between words with \s+, and replace \w. with \S+ to match any run of non-whitespace characters.
Use
import re
text = 'The special code is 034567 in this particular case and not 98675'
p = re.compile(r'special code\s+\S+\s+(\d+)')
m = p.search(text)
if m:
    print(m.group(1))  # 034567
    print(m.span(1))   # (20, 26)
See the Python demo and the regex demo.
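If the phrase varies, the same approach could be wrapped in a small helper. The following is only a sketch: find_number_after is a hypothetical name, not part of the re module, and it assumes exactly one whitespace-delimited word sits between the phrase and the number, as in the example text. re.escape is used so that any regex metacharacters in the phrase are treated literally.

```python
import re

def find_number_after(phrase, text):
    """Return (digits, start, end) for the number one word after phrase, or None.

    Hypothetical helper for illustration only.
    """
    # Escape the phrase, then require: whitespace, one word, whitespace, digits.
    pattern = re.compile(re.escape(phrase) + r'\s+\S+\s+(\d+)')
    m = pattern.search(text)
    if m is None:
        return None
    return m.group(1), m.start(1), m.end(1)

text = 'The special code is 034567 in this particular case and not 98675'
result = find_number_after('special code', text)
print(result)  # ('034567', 20, 26)
```

Returning None when nothing matches keeps the caller's check as simple as the if m: test above.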