
Regex match following a substring in a Python string

I've come up with a regular expression that works well enough for my purposes for finding phone numbers.

I would like to take it a step further and use it on large blocks of text to find matches that follow the word 'cell' or 'mobile' by at most 10 characters. It should return the number in Cell Phone: (954) 555-4444 as well as the one in Mobile 555-777-9999, but not the one in Fax: (555) 444-6666.

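Something like this (a rough sketch of what I'm after):

import re

regex = re.compile(r'(\+?[2-9]\d{2}\)?[ -]?\d{3}[ -]?\d{4})')
bigstring = ''  # placeholder: some giant string added together from many globbed files
for match in regex.finditer(bigstring):
    # Text just before the match: 10 characters plus room for the keyword itself
    before = bigstring[max(0, match.start() - 16):match.start()]
    # Keep the match only if 'cell' or 'mobile' ends within 10 characters of it
    if re.search(r'(?:cell|mobile).{0,10}$', before, re.IGNORECASE):
        print(match.group(0))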

You can do:

txt='''\
Call me on my mobile anytime: 555-666-1212 
The office is best at 555-222-3333 
Dont ever call me at 555-666-2345 '''

import re

# keyword, then up to 15 arbitrary characters, then the phone number
print(re.findall(r'(?:(mobile|office).{0,15}(\+?[2-9]\d{2}\)?[ -]?\d{3}[ -]?\d{4}))', txt))

Prints:

[('mobile', '555-666-1212'), ('office', '555-222-3333')]
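
Adapted to the wording of the question, the same idea might look like the sketch below. It rests on a few assumptions that are not in the original posts: the keywords are matched case-insensitively and as whole words, the gap is capped at 10 characters, and an optional opening parenthesis is added to the number pattern so that the full area code is captured. The names `PHONE`, `pattern` and `find_mobile_numbers` are made up for illustration.

```python
import re

# The question's number pattern, plus an optional opening parenthesis
# (that addition is an assumption, not part of the original pattern).
PHONE = r'\(?\+?[2-9]\d{2}\)?[ -]?\d{3}[ -]?\d{4}'

# Keyword as a whole word, then at most 10 arbitrary characters, then the number.
# The gap is lazy so it stays as short as possible and the "(" ends up in the number.
pattern = re.compile(r'\b(?:cell|mobile)\b.{0,10}?(' + PHONE + r')', re.IGNORECASE)


def find_mobile_numbers(text):
    """Return the numbers that closely follow 'cell' or 'mobile' in text."""
    return pattern.findall(text)


sample = '''\
Cell Phone: (954) 555-4444
Mobile 555-777-9999
Fax: (555) 444-6666'''

print(find_mobile_numbers(sample))
```

With this sample the call prints ['(954) 555-4444', '555-777-9999']; the fax number is skipped because neither keyword precedes it. Since `.` does not match a newline by default, a keyword on one line never pairs with a number on the next.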

You can do that with your regular expression. In the re documentation, you will find that the pattern r'(?<=abc)def' matches 'def' only if it is preceded by 'abc'.

Similarly, r'Hello (?=World)' matches 'Hello ' only if it is followed by 'World'.
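
A small sketch of both assertions, using the patterns quoted above on made-up sample strings. One caveat worth knowing: Python's built-in re module only accepts fixed-width lookbehind patterns, so a lookbehind on its own cannot express "at most 10 characters after cell or mobile". The last part of the sketch shows that rejection.

```python
import re

# Lookbehind: 'def' matches only when 'abc' comes immediately before it.
lookbehind_hits = re.findall(r'(?<=abc)def', 'abcdef xyzdef')
print(lookbehind_hits)  # only the first 'def' qualifies

# Lookahead: 'Hello ' matches only when 'World' comes right after it;
# 'World' itself is not part of the match.
m = re.search(r'Hello (?=World)', 'Hello World')
print(repr(m.group(0)))

# A variable-width lookbehind is rejected by the built-in re module.
rejected = False
try:
    re.compile(r'(?<=cell.{0,10})\d{3}')
except re.error as exc:
    rejected = True
    print('re.error:', exc)
```

Because of that restriction, matching the keyword and the gap as ordinary pattern text and capturing only the number in a group tends to be the simpler route for this question.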
