
Python regex to match all variations of a keyword except when preceded by a capitalized word

I'm looking for a Python regex that matches all variations of a keyword, except when the keyword is preceded by a capitalized word. The exception does not apply when that capitalized word is the start of a sentence: in that case the keyword should still match. Keywords inside brackets should also be excluded.

For example:

keyword = 'public record'
string1 = 'Hello. His public records are available at city hall.'  # match "public records": "His" starts a sentence, so its capitalization is ignored
string2 = 'his records are at Newsom Public Record DataBase'       # no match
string3 = 'Public records may be available online'                 # match "Public records"
string4 = '[public records](http:/....)'                           # no match

So far I have tried:

pattern = f'(?<!\[)(?i)\\w*{keyword}\\w*'   # Doesn't take preceding capitalized words into account
pattern = f'(?<![A-Z][\w-]\s)(?<!\[)(?i)\\w*{keyword}\\w*'   # Doesn't work for capitalized words longer than 2 characters

You can spell out the allowed beginnings explicitly, i.e. a sentence start followed by a capitalized word, a non-capitalized word, or the beginning of the string, and then use a lookahead to assert that the keyword follows:

pattern = r'(\. [A-Z]\w* |\W[^A-Z]\w* |^)(?=[pP]ublic [rR]ecord)'
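To check the pattern against the four example strings from the question, a minimal sketch along the following lines may help. Note that the pattern as written matches the text before the keyword, not the keyword itself, so it works as a yes/no test with `re.search`. The helpers `has_keyword`, `build_pattern` and `find_keyword` are hypothetical names introduced here for illustration; `build_pattern` assumes the keyword consists of alphabetic words separated by single spaces, and it swaps the lookahead for a capturing group so the matched keyword can be returned.

```python
import re

# The answer's pattern, copied unchanged.
pattern = r'(\. [A-Z]\w* |\W[^A-Z]\w* |^)(?=[pP]ublic [rR]ecord)'

def has_keyword(text):
    """Yes/no test: is the keyword preceded by an allowed beginning?"""
    return re.search(pattern, text) is not None

def build_pattern(keyword):
    """Hypothetical helper: same idea, built from an arbitrary keyword.

    Assumes alphabetic words separated by single spaces. The first letter
    of each word may be upper or lower case, and the keyword (plus any
    trailing word characters, e.g. a plural "s") is captured in group 1.
    """
    words = []
    for word in keyword.split():
        first, rest = word[0], word[1:]
        words.append(f'[{first.lower()}{first.upper()}]{re.escape(rest)}')
    allowed_start = r'(?:\. [A-Z]\w* |\W[^A-Z]\w* |^)'
    return allowed_start + '(' + ' '.join(words) + r'\w*)'

def find_keyword(text, keyword='public record'):
    """Return the matched keyword variant, or None if there is no match."""
    m = re.search(build_pattern(keyword), text)
    return m.group(1) if m else None

# Example strings and expected outcomes, taken from the question.
examples = [
    ('Hello. His public records are available at city hall.', True),
    ('his records are at Newsom Public Record DataBase', False),
    ('Public records may be available online', True),
    ('[public records](http:/....)', False),
]

for text, expected in examples:
    print(has_keyword(text) == expected, find_keyword(text), '|', text)
```

Both variants rely on single spaces between words, and `[^A-Z]` can itself match a space, so punctuation directly before a capitalized word (for example a comma) may let an unwanted match through. Treat this as a starting point to test against your own data rather than a complete solution.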


