I am trying to check whether a string contains a three-letter acronym using a regex.
My current regex:
re.search(r'\b[A-Z]{3}', string)
Currently it matches USA, NYCs, and NSFW, but it should not match NSFW, because that is a four-letter acronym, not a three-letter one.
How can I adjust the regex so that it rejects NSFW but still accepts NYCs?
EDIT: it should also accept NYC,
Use a negative lookahead assertion, (?!pattern):
re.search(r'\b[A-Z]{3}(?![A-Z])', string)
This requires that the three capitals are not followed by another capital letter, without imposing any other restriction, such as requiring that something follow the pattern at all. Think "not followed by P" versus "followed by not-P".
Try (in Python 3, filter returns a lazy iterator, so wrap it in list to see the results):
list(filter(re.compile(r'\b[A-Z]{3}(?![A-Z])').search, ['.ANS', 'ANSs', 'AANS', 'ANS.']))
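A minimal sketch checking the negative lookahead against the inputs from the question. The helper name has_three_letter_acronym is hypothetical, introduced only for illustration:

```python
import re

# Three capitals starting at a word boundary, not followed by a fourth capital.
ACRONYM = re.compile(r'\b[A-Z]{3}(?![A-Z])')

def has_three_letter_acronym(text):
    """Return True if text contains three capitals at a word start with no fourth capital after them."""
    return ACRONYM.search(text) is not None

for sample in ['USA', 'NYCs', 'NYC,', 'NSFW', 'aUSA']:
    print(sample, has_three_letter_acronym(sample))
```

The lookahead consumes nothing, so it succeeds at the end of the string as well as before a lowercase letter or punctuation.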
>>> import re
>>> rexp = r'(?:\b)([A-Z]{3})(?:$|[^A-Z])'
>>> re.search(rexp, 'USA').groups()
('USA',)
>>> re.search(rexp, 'NSFW') is None
True
>>> re.search(rexp, 'aUSA') is None
True
>>> re.search(rexp, 'NSF,').groups()
('NSF',)
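Because the acronym sits in a capturing group, the same pattern can also pull acronyms out of longer text. A small sketch, assuming re.findall suits the use case; the sample sentence is made up for illustration:

```python
import re

rexp = r'(?:\b)([A-Z]{3})(?:$|[^A-Z])'

# With exactly one capturing group, findall returns that group's text for each match.
sample = 'USA, NYCs and NSFW'  # hypothetical input
found = re.findall(rexp, sample)
print(found)
```

Note that the trailing alternative consumes one character after the acronym, unlike a lookahead, which only inspects it.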
You can use ? to make a character optional; {0,1} is equivalent.
Put the characters you want to allow inside square brackets [ ]. Followed by ?, the class matches any one of them zero or one times, so NYC., WINs, and FOO, all match.
Add $ at the end to require that no more characters follow the match.
re.search(r'\b[A-Z]{3}[s,.]?$', string)
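One consequence worth checking: because of the $ anchor, this pattern only matches when the acronym is the last thing in the string. A short sketch, with sample inputs chosen for illustration:

```python
import re

pattern = re.compile(r'\b[A-Z]{3}[s,.]?$')

# Acronym at the very end of the string, optionally followed by s, comma, or period.
at_end = [s for s in ['NYC.', 'WINs', 'FOO,', 'NSFW'] if pattern.search(s)]

# The same kind of acronym followed by more text no longer matches.
mid_sentence = pattern.search('USA today')

print(at_end, mid_sentence)
```

If the acronym may appear anywhere in a longer string, a lookahead-based pattern is likely the better fit.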