
Regex search pattern to extract string with 2 words limit

I am looking to build a pattern that uses my_string_1 as the starting point and extracts everything from there up to my_string_2, i.e. Papa Johns some_text Inc.

my_string_1 = 'Papa Johns'

my_string_2 = 'Inc.'

I need to search in any of the sentences below.

sent_1 = The company Papa Johns Retail Chain Inc. sells pizza, pastas etc.

sent_2 = The company Papa Johns Retail Chain., Inc. sells pizza, pastas etc.

sent_3 = The company Papa Johns Retail Chain, Inc. sells pizza, pastas etc.

sent_4 = The company Papa Johns Retail., Chain, Inc. sells pizza, pastas etc.

sent_5 = The company Papa Johns Retail, Inc. sells pizza, pastas etc.

I built the pattern pattern = '''Papa Johns (.{,30})Inc.''' and it works fine.

Is it possible to drop the 30-character condition and instead use a limit of 2 words (perhaps split on spaces), so that the required text is extracted from all of the sentences?

You could use the pattern:

\bPapa Johns(?: \S+){0,2} Inc\.

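This matches Papa Johns ... Inc. with at most 2 words in between. Here a "word" is a space followed by a run of non-whitespace characters (\S+), so trailing punctuation such as the "., " in "Chain., " stays attached to the word before it.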
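To make the pieces of the pattern easier to see, here is a minimal sketch that writes the same expression with re.VERBOSE and one comment per part. The name PATTERN is just an illustrative choice; note that in verbose mode a literal space has to be escaped.

```python
import re

# The same pattern, spelled out piece by piece.
# In verbose mode unescaped whitespace is ignored, so literal spaces are written as "\ ".
PATTERN = re.compile(r"""
    \bPapa\ Johns      # literal start string, anchored on a word boundary
    (?:\ \S+){0,2}     # zero to two words: a space plus a run of non-space characters
    \ Inc\.            # a space, then the literal end string Inc. (dot escaped)
""", re.VERBOSE)

m = PATTERN.search("The company Papa Johns Retail., Chain, Inc. sells pizza, pastas etc.")
print(m.group(0) if m else None)
```

Calling m.group(0) returns the whole matched span, which is what the question asks to extract.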

Python script:

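import re

# First sentence has 2 words between the markers, second has 3.
inp = ["The company Papa Johns Retail Chain Inc. sells pizza, pastas etc.", "The company Papa Johns New Retail Chain Inc. sells pizza, pastas etc."]
for i in inp:
    # Allow at most two space-separated words between "Papa Johns" and "Inc."
    if re.search(r'\bPapa Johns(?: \S+){0,2} Inc\.', i):
        print("MATCH:    " + i)
    else:
        print("NO MATCH: " + i)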

This prints:

MATCH:    The company Papa Johns Retail Chain Inc. sells pizza, pastas etc.
NO MATCH: The company Papa Johns New Retail Chain Inc. sells pizza, pastas etc.
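The script above only reports whether a sentence matches. As a rough sketch of the extraction the question asks for, the pattern can be assembled from the two strings and applied to the five sample sentences. The helper extract_between and its max_words parameter are hypothetical names introduced here for illustration, not part of the original answer.

```python
import re

def extract_between(text, start, end, max_words=2):
    """Return the text from `start` through `end` if at most `max_words`
    space-separated words lie between them, otherwise None."""
    # re.escape keeps characters such as the dot in "Inc." literal.
    pattern = r"\b" + re.escape(start) + r"(?: \S+){0,%d} " % max_words + re.escape(end)
    match = re.search(pattern, text)
    return match.group(0) if match else None

my_string_1 = "Papa Johns"
my_string_2 = "Inc."

sentences = [
    "The company Papa Johns Retail Chain Inc. sells pizza, pastas etc.",
    "The company Papa Johns Retail Chain., Inc. sells pizza, pastas etc.",
    "The company Papa Johns Retail Chain, Inc. sells pizza, pastas etc.",
    "The company Papa Johns Retail., Chain, Inc. sells pizza, pastas etc.",
    "The company Papa Johns Retail, Inc. sells pizza, pastas etc.",
]

results = [extract_between(s, my_string_1, my_string_2) for s in sentences]
for r in results:
    print(r)
```

Under these assumptions each of the five sentences yields the span from Papa Johns through Inc., while a sentence with three or more words in between gives None.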
