
Trying to find the regex for this particular case? Also, can I parse this without creating groups?

The text to capture looks like this:

  Policy Number    ABCD000012345    other text follows in same line....

My regex looks like this:

 regex value='(?i)(?:[P|p]olicy\s[N|n]o[|:|;|,][\n\r\s\t]*[\na-z\sA-Z:,;\r\d\t]*[S|s]e\s*[H|h]abla\s*[^\n]*[\n\s\r\t]*|(?i)[P|p]olicy[\s\n\t\r]*[N|n]umber[\s\n\r\t]*)(?P<policy_number>[^\n]*)'

This particular case matches the second branch of the alternation (shown below). However, it also captures everything after the policy number. What stopping condition would make it grab just the number? I know something is wrong but can't find a way out.

 (?i)[P|p]olicy[\s\n\t\r]*[N|n]umber[\s\n\r\t]*)

Current output:

    ABCD000012345othertextfollowsinsameline....

Expected output:

   ABCD000012345

You can use a simpler regex that matches from the start of the line, "[Pp]olicy\s*[Nn]umber\s*\b([A-Z]{4}\d+)\b.*", and relies on the word boundary \b to stop the capture at the end of the policy number.

import re

# [Pp] and [Nn] replace [P|p] and [N|n]: inside a character class, | is a literal pipe.
pattern = re.compile(r"[Pp]olicy\s*[Nn]umber\s*\b([A-Z0-9]+)\b.*")
line = "Policy Number    ABCD000012345    other text follows in same line...."
matches = pattern.match(line)
id_res = matches.group(1)
print(id_res)  # ABCD000012345

And if there are always two words before the number, you can use (?:\w+\s+){2}\b([A-Z0-9]+)\b.*
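A minimal sketch of the two-word variant, assuming the sample line from the question. The names `two_words`, `match` and `policy_id` are only illustrative, and the `None` check is an addition for lines that do not match.

```python
import re

# Skip exactly two leading words, then capture the next run of capitals/digits.
two_words = re.compile(r"(?:\w+\s+){2}\b([A-Z0-9]+)\b.*")

line = "Policy Number    ABCD000012345    other text follows in same line...."
match = two_words.match(line)

# match() returns None when the line does not fit the pattern.
policy_id = match.group(1) if match else None
print(policy_id)  # ABCD000012345
```

Because this pattern does not look at the label at all, it only fits input where the number is always the third token on the line.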


Also, \s already stands for [\r\n\t\f\v ], so there is no need to repeat them: your [\n\r\s\t] is just \s
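The equivalence can be checked quickly in Python. This is only a sketch: the `sample` string and the variable names are made up for illustration, and the label patterns are simplified from the question. Note that for `str` patterns Python's `\s` also matches other Unicode whitespace, so it is a superset of the listed characters.

```python
import re

# Each character in [\r\n\t\f\v ] is matched by \s on its own.
for ch in ["\r", "\n", "\t", "\f", "\v", " "]:
    assert re.fullmatch(r"\s", ch)

# Hypothetical label with mixed whitespace between the two words.
sample = "Policy \t\r\n Number"

long_form = re.fullmatch(r"Policy[\s\n\t\r]*Number", sample)
short_form = re.fullmatch(r"Policy\s*Number", sample)
print(bool(long_form), bool(short_form))  # True True
```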

You don't need to spell out both upper- and lower-case p and n, since (?i) already makes the match case-insensitive.

Also, \s already covers \n, \t and \r.

(?i)policy\s+number\s+([A-Z]{4}\d+)\b
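A short sketch of how the pattern above behaves in Python. One thing to be aware of: (?i) applies to the whole pattern, so [A-Z] will also accept lowercase letters. The `strict` variant, which scopes the flag to the label with `(?i:...)`, is an assumption about what may be wanted and is not part of the answer; the `lower` test string is made up.

```python
import re

# Flag applies to the whole pattern, so [A-Z] also accepts lowercase letters.
loose = re.compile(r"(?i)policy\s+number\s+([A-Z]{4}\d+)\b")
# Hypothetical variant: scope the flag to the label only, keeping the ID uppercase.
strict = re.compile(r"(?i:policy\s+number)\s+([A-Z]{4}\d+)\b")

line = "Policy Number    ABCD000012345    other text follows in same line...."
lower = "policy number abcd000012345 trailing text"

print(loose.search(line).group(1))   # ABCD000012345
print(loose.search(lower).group(1))  # abcd000012345
print(strict.search(lower))          # None
```

In both variants the capture stops at the number because the group only admits four letters followed by digits, and \b then requires the number to end there.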

For verification purposes: Regex

Another solution:

^[\s\w]+\b([A-Z]{4}\d+)\b
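A sketch of how this label-independent pattern behaves, assuming Python's `re`. The second and third sample lines are hypothetical and only there to show where the pattern does and does not match.

```python
import re

# Anything made of word characters and whitespace may precede the ID.
pattern = re.compile(r"^[\s\w]+\b([A-Z]{4}\d+)\b")

samples = [
    "Policy Number    ABCD000012345    other text follows in same line....",
    "Member ID  WXYZ98765 rest of line",  # hypothetical alternative label
    "Policy No: ABCD000012345",           # ':' is outside [\s\w], so no match
]

results = []
for text in samples:
    m = pattern.match(text)
    results.append(m.group(1) if m else None)

print(results)  # ['ABCD000012345', 'WXYZ98765', None]
```

The trade-off is that any punctuation between the label and the number, such as a colon, prevents a match unless it is added to the leading character class.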

For verification purposes: Regex

I like this one better, in case the label text changes from "Policy Number" to something else.
