Python matching greedy phrase search.

I have a regex matching requirement: I want to match a complete phrase instead of its individual subtokens. Here's an example:

In [21]: re.findall(r"""don't|agree|don't agree""", "I don't agree to this", re.IGNORECASE)
Out[21]: ["don't", 'agree']

I want this to return "don't agree" as a single match, not "don't" and "agree" separately.

Any help would be appreciated.

Put the longest alternative first. Alternation tries the alternatives from left to right and stops at the first one that matches:

re.findall(r"don't agree|don't|agree", "I don't agree to this", re.IGNORECASE)
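If the list of phrases is not fixed, the same ordering can be built programmatically. This is only a minimal sketch: the helper name build_phrase_pattern is hypothetical and not part of the re module, and it assumes the phrases are literal strings rather than regex fragments.

```python
import re

def build_phrase_pattern(phrases):
    """Hypothetical helper: join literal phrases into one alternation, longest first."""
    # Sorting by length (descending) ensures a multi-word phrase is tried
    # before any shorter phrase it starts with.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape treats each phrase as literal text, not as a regex.
    return re.compile("|".join(re.escape(p) for p in ordered), re.IGNORECASE)

pattern = build_phrase_pattern(["don't", "agree", "don't agree"])
matches = pattern.findall("I don't agree to this")
print(matches)
```

With the input from the question, the phrase is returned as one match because "don't agree" now comes before "don't" in the alternation.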

Or use an optional non-capturing group:

re.findall(r"don't(?: agree)?|agree", "I don't agree to this", re.IGNORECASE)
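The group should stay non-capturing, (?: ... ), because re.findall returns the group contents instead of the whole match when the pattern contains a capturing group. A small comparison, with variable names chosen only for illustration:

```python
import re

text = "I don't agree to this"

# Non-capturing group: findall returns the full text of each match.
non_capturing = re.findall(r"don't(?: agree)?|agree", text, re.IGNORECASE)

# Capturing group: findall returns only what the group captured.
capturing = re.findall(r"don't( agree)?|agree", text, re.IGNORECASE)

print(non_capturing)
print(capturing)
```

The first call yields the whole phrase, while the second yields only the captured " agree" fragment, which is probably not what is wanted here.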

Use lookaround in your regex:

re.findall(r"""don't(?!\sagree)|(?<!don't\s)agree|don't agree""", "I don't agree to this", re.IGNORECASE)
                    ^^^^^^^^^^^ ^^^^^^^^^^^^

The negative lookahead (?!\sagree) checks that "agree" does not follow "don't".

The negative lookbehind (?<!don't\s) checks that "don't" does not precede "agree".
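To see how the three alternatives interact, the lookaround pattern can be run against a few inputs. This is just an illustrative sketch; the sample sentences other than the one from the question are made up.

```python
import re

# Same pattern as above, compiled once for reuse.
pattern = re.compile(
    r"""don't(?!\sagree)|(?<!don't\s)agree|don't agree""",
    re.IGNORECASE,
)

# Full phrase: the first two alternatives are blocked by the lookarounds,
# so only the third alternative can match.
full_phrase = pattern.findall("I don't agree to this")

# Words apart from each other: each one matches on its own.
separate_words = pattern.findall("I don't know, but you agree")

print(full_phrase)
print(separate_words)
```

Python's re module only accepts fixed-width lookbehinds, which is why (?<!don't\s) works here: it always inspects exactly six characters.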
