Python regular expression - sometimes the pattern can be at the end of the string and sometimes in the middle
I need to return a match if the following substring is found, but no other alphanumeric character may appear directly before or after it.
For example, searching for the text "OCI" in a filename:
import re
file_pattern = r".*([^a-zA-Z0-9]OCI[^a-zA-Z0-9]).*"
text = "rce oci "
m = re.match(file_pattern, text, re.IGNORECASE)
if m is not None:
    print(m)
else:
    print("no match found")
The above code works as intended in these cases:
text = "rce oci " -> match found (note the extra white space after oci)
text = "rceoci" -> no match found
But if text = "rce oci" it does not return a match. Note that there is no trailing white space here.
How can I fix this?
Thanks
You can use a word boundary (\b) in your pattern and change re.match to re.search:
import re
file_pattern = r"\bOCI\b"
text = "rce oci"
m = re.search(file_pattern, text, re.IGNORECASE)
if m is not None:
    print(m)
else:
    print("no match found")
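To check the suggestion against the three inputs from the question, here is a minimal sketch. It also covers one assumption that goes beyond the original post: \b treats the underscore as a word character, so a name like "rce_oci.txt" would not match. If only letters and digits should block a match, a lookaround variant (named ALNUM_ONLY here purely for illustration) may fit the stated requirement more closely.

```python
import re

# Word-boundary pattern from the answer above.
WORD_BOUNDARY = re.compile(r"\bOCI\b", re.IGNORECASE)

# Hypothetical stricter variant (not from the original answer): the lookarounds
# reject only ASCII letters and digits next to the token, so "_" acts as a separator.
ALNUM_ONLY = re.compile(r"(?<![a-zA-Z0-9])OCI(?![a-zA-Z0-9])", re.IGNORECASE)

# The first three samples come from the question; the last one is an added example.
samples = ["rce oci ", "rce oci", "rceoci", "rce_oci.txt"]

results = {
    text: (bool(WORD_BOUNDARY.search(text)), bool(ALNUM_ONLY.search(text)))
    for text in samples
}

for text, (word_boundary, alnum_only) in results.items():
    print(f"{text!r}: word boundary={word_boundary}, alnum lookaround={alnum_only}")
```

Both patterns are zero-width at the edges, which is why they succeed at the end of the string where the original [^a-zA-Z0-9] class needed an actual character to consume.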
Note that re.match only matches at the beginning of the string, so with the updated pattern it would fail unless OCI is at the very start. re.search scans the whole string for the first match.
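The difference between the matching functions can be seen in a short sketch. re.fullmatch is included only for comparison and is not part of the original answer; the variable names are illustrative.

```python
import re

pattern = r"\bOCI\b"
text = "rce oci"

# re.match tries the pattern only at index 0 of the string.
at_start = re.match(pattern, text, re.IGNORECASE)

# re.search scans forward and returns the first position where the pattern matches.
anywhere = re.search(pattern, text, re.IGNORECASE)

# re.fullmatch requires the pattern to cover the entire string.
whole = re.fullmatch(pattern, text, re.IGNORECASE)

# re.match does succeed when the token happens to sit at the start.
leading = re.match(pattern, "oci file", re.IGNORECASE)

print(at_start)
print(anywhere)
print(whole)
print(leading)
```

The original pattern worked with re.match only because its leading .* let the match start at index 0 and skip ahead to the token.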