I have a regex matching requirement: I want to match a complete phrase instead of its individual subtokens. Here's an example:
In [21]: re.findall(r"""don't|agree|don't agree""", "I don't agree to this", re.IGNORECASE)
Out[21]: ["don't", 'agree']
I want this to match "don't agree" as a whole, and not "don't" and "agree" separately.
Any help would be appreciated.
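Put the longest alternative first, because the regex engine tries the alternatives from left to right and takes the first one that matches: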
re.findall(r"don't agree|don't|agree", "I don't agree to this", re.IGNORECASE)
or use an optional group:
re.findall(r"don't(?: agree)?|agree", "I don't agree to this", re.IGNORECASE)
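When the list of phrases is longer or only known at run time, the same longest-first ordering can be generated rather than written by hand. The following is a minimal sketch, not part of the original answer: build_phrase_pattern is a hypothetical helper name (it is not in the re module), and it assumes the phrases should be matched literally, so each one is passed through re.escape.

```python
import re

def build_phrase_pattern(phrases):
    """Hypothetical helper: compile an alternation that tries longer phrases first."""
    # Sort longest first, so "don't agree" is tried before "don't" and "agree".
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape treats each phrase as literal text, not as regex syntax.
    return re.compile("|".join(re.escape(p) for p in ordered), re.IGNORECASE)

pattern = build_phrase_pattern(["don't", "agree", "don't agree"])
matches = pattern.findall("I don't agree to this")
print(matches)
```

Sorting by length is enough here because a phrase can only be shadowed by a shorter alternative that is a prefix of it, and the shorter one now always comes later in the alternation.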
Use lookaround in your regex:
re.findall(r"""don't(?!\sagree)|(?<!don't\s)agree|don't agree""", "I don't agree to this", re.IGNORECASE)
The negative lookahead (?!\sagree) checks that there is no agree after the don't.
The negative lookbehind (?<!don't\s) checks that there is no don't before the agree.
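To see how the lookaround pattern above behaves when only one of the words is present, it can be run against a few inputs. This is a small sketch; the pattern is copied from the answer, while the sample sentences other than the first are made up for illustration.

```python
import re

# Pattern copied from the answer above.
pattern = re.compile(
    r"don't(?!\sagree)|(?<!don't\s)agree|don't agree",
    re.IGNORECASE,
)

# The first sentence is from the question; the other two are illustrative.
samples = [
    "I don't agree to this",
    "I don't know",
    "I agree to this",
]

# Map each sentence to the list of matches found in it.
results = {s: pattern.findall(s) for s in samples}
for sentence, found in results.items():
    print(sentence, "->", found)
```

Note that Python's re module only accepts fixed-width lookbehind, which don't\s satisfies; a lookbehind such as (?<!don't\s+) would be rejected when the pattern is compiled.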