Regex to match end of line or whitespace followed by wildcard characters
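I have a string from which I am trying to match a city and a state using a regular expression in Python. Some of the strings end with a country code preceded by a space. I am having trouble writing a regex that matches all cases and captures the city in the first capture group and the state in the second.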
[^.*]?Born:.*in[^.](.*),[^.*](.*)
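This is the regex I currently have, and these are some of the example strings I am trying to match.
With my current regex, this is my current output:
The expected output would be
How do I account for this trailing string made up of a space and two characters? Any help would be much appreciated.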
Use
Born:.*in\s+([^,]*),\s+(.*?)(?=(?:\s[A-Za-z]{2})?$)
See the regex proof.
Explanation
Born: - matches the characters Born: literally (case sensitive)
.* - matches any character (except for line terminators), between zero and unlimited times, as many times as possible, giving back as needed (greedy)
in - matches the characters in literally (case sensitive)
\s+ - matches any whitespace character (equivalent to [\r\n\t\f\v ]) between one and unlimited times, as many times as possible, giving back as needed (greedy)
1st Capturing Group ([^,]*)
[^,]* - matches a single character not present in the list below, between zero and unlimited times, as many times as possible, giving back as needed (greedy)
, - matches the character , with index 44 in decimal (2C in hexadecimal, 54 in octal) literally (case sensitive)
, - after the capturing group, matches the character , literally (case sensitive)
\s+ - matches any whitespace character (equivalent to [\r\n\t\f\v ]) between one and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Capturing Group (.*?)
.*? - matches any character (except for line terminators) between zero and unlimited times, as few times as possible, expanding as needed (lazy)
Positive Lookahead (?=(?:\s[A-Za-z]{2})?$)
Assert that the Regex below matches
Non-capturing group (?:\s[A-Za-z]{2})?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
Match a single character present in the list below [A-Za-z]
{2} matches the previous token exactly 2 times
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
$ asserts position at the end of a line (in Python, without re.MULTILINE, that is the end of the string or just before a trailing newline)
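As a quick check, the pattern can be tried in Python roughly as follows. This is only a sketch: the sample strings are made up for illustration (the original examples are not shown in the post), and the helper name `city_and_state` is hypothetical.

```python
import re

# The pattern from the answer, compiled once for reuse.
PATTERN = re.compile(r"Born:.*in\s+([^,]*),\s+(.*?)(?=(?:\s[A-Za-z]{2})?$)")

# Hypothetical sample lines (assumed format, not taken from the question).
samples = [
    "Born: January 1, 1990 in Springfield, Illinois us",  # trailing country code
    "Born: March 3, 1985 in Portland, Oregon",            # no country code
]

def city_and_state(line):
    """Return (city, state) if the line matches, otherwise None."""
    match = PATTERN.search(line)
    return match.groups() if match else None

results = [city_and_state(s) for s in samples]
for sample, result in zip(samples, results):
    print(sample, "->", result)
```

The lazy `(.*?)` stops growing as soon as the lookahead sees either the end of the string or a space plus two letters followed by the end, so the optional country code stays out of the second group. One caveat under this approach: a state whose last word is exactly two letters long would be mistaken for a country code.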