
How to match patterns within one sentence using regex in Python?

Here are two examples:

1. I need to take this apple. I just finished the first one.

2. I need to get some sleep. apple is not working.

I want to match text in which need and apple appear in the same sentence. The pattern need.*apple matches both examples, but I want it to match only the first one. How should I change the pattern, or are there other string methods in Python that would do this?

The comment posted by @ctwheels, which suggests splitting on . and then testing whether each piece contains both apple and need, is a good one and does not require regular expressions. I would, however, first split each piece again on white space and test the words against the resulting list, so that you do not match applesauce. But here is a regex solution:

import re

text = """I need to take this apple. I just finished the first one.
I need to get some sleep. apple is not working."""

regex = re.compile(r"""
    [^.]*           # match 0 or more non-period characters
    (
        \bneed\b    # match 'need' on a word boundary
        [^.]*       # match 0 or more non-period characters
        \bapple\b   # match 'apple' on a word boundary
      |             # or
        \bapple\b   # match 'apple' on a word boundary
        [^.]*       # match 0 or more non-period characters
        \bneed\b    # match 'need' on a word boundary
    )
    [^.]*           # match 0 or more non-period characters
    \.              # match a period
    """, flags=re.VERBOSE)

for m in regex.finditer(text):
    print(m.group(0))

Prints:

I need to take this apple.
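
For comparison, the non-regex approach from the comment (split on periods, then on white space) might look like the sketch below. The function name sentences_with_words is made up for illustration, and the sketch assumes that every period ends a sentence and that matching is case-sensitive.

```python
def sentences_with_words(text, words):
    """Return sentences (split naively on '.') that contain every word in `words`."""
    wanted = set(words)
    matches = []
    for sentence in text.split("."):
        # Split on white space so that 'applesauce' does not count as 'apple'.
        tokens = set(sentence.split())
        if wanted <= tokens:
            matches.append(sentence.strip() + ".")
    return matches


text = """I need to take this apple. I just finished the first one.
I need to get some sleep. apple is not working."""

print(sentences_with_words(text, ["need", "apple"]))
```

Because the test is on whole tokens, a word followed directly by a comma or other punctuation (for example apple,) would not be recognized without stripping that punctuation first.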

Both of these solutions break down if a sentence contains a period used for something other than ending the sentence, as in I need to take John Q. Public's apple. In that case you need a more powerful mechanism for dividing the text into sentences. The regex applied to each sentence then becomes simpler, of course, although splitting on white space still seems to make the most sense.
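
As a rough illustration of what a somewhat more careful splitter could look like, the sketch below refuses to break after a single capital-letter initial such as Q. and then checks each sentence by splitting on white space. The names SENTENCE_BREAK, split_sentences and has_all_words are hypothetical, and the heuristic is only an assumption made for this example: it will still mishandle abbreviations such as Dr. or e.g., and a sentence that really does end in a single capital letter.

```python
import re
import string

# Assumed heuristic: break on white space that follows '.', '!' or '?',
# unless the period belongs to a single capital-letter initial such as 'Q.'.
SENTENCE_BREAK = re.compile(r"(?<!\b[A-Z]\.)(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences using the heuristic above."""
    return [s for s in SENTENCE_BREAK.split(text.strip()) if s]


def has_all_words(sentence, words):
    """True if every word appears as a whole white-space-separated token."""
    # Strip surrounding punctuation so 'apple.' is compared as 'apple'.
    tokens = {tok.strip(string.punctuation) for tok in sentence.split()}
    return set(words) <= tokens


text = "I need to take John Q. Public's apple. I need to get some sleep. apple is not working."

for sentence in split_sentences(text):
    if has_all_words(sentence, ["need", "apple"]):
        print(sentence)
```

For text with many abbreviations, a dedicated sentence tokenizer from a natural-language library is likely to be more reliable than any single regular expression.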
