Regex string between square brackets only if '.' is within string
I'm trying to detect the text between two square brackets in Python, but I only want the result when there is a "." within it.
I currently have \[(.*?)\] as my regex, using the following example:
String To Search: CASE[Data Source].[Week] = 'THIS WEEK'
Result: Data Source, Week
However, I need the whole string as [Data Source].[Week] (square brackets included, only if there is a '.' in the middle of the string). There could also be multiple instances where it matches.
You might write a pattern that matches [...], then repeats 1 or more times a . followed by another [...]:
\[[^][]*](?:\.\[[^][]*])+
Explanation
\[[^][]*] Match from [ to ] using a negated character class
(?: Non-capturing group, so the part can be repeated as a whole
\.\[[^][]*] Match a dot and then another [...]
)+ Close the non-capturing group and repeat it 1 or more times
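As a minimal sketch of the breakdown above, the same pattern can be written with re.VERBOSE so that each part carries its own comment. The sample strings are only illustrative assumptions, not taken from the question.

```python
import re

# Same pattern as above, laid out with re.VERBOSE so each part can be commented.
pattern = re.compile(r"""
    \[[^][]*]        # one bracketed part: no nested brackets allowed inside
    (?:              # non-capturing group, repeated as a whole
        \.           # a literal dot
        \[[^][]*]    # another bracketed part
    )+               # the dot-plus-brackets group must occur one or more times
""", re.VERBOSE)

# Illustrative inputs (assumed for this sketch)
with_dot = pattern.findall("CASE[Data Source].[Week] = 'THIS WEEK'")
without_dot = pattern.findall("CASE[Week] = 'THIS WEEK'")
print(with_dot)
print(without_dot)
```

A bracketed value with no dot after it produces no match, because the group must occur at least once.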
To get multiple matches, you can use re.findall:
import re
pattern = r"\[[^][]*](?:\.\[[^][]*])+"
s = ("CASE[Data Source].[Week] = 'THIS WEEK'\n"
"CASE[Data Source].[Week] = 'THIS WEEK'")
print(re.findall(pattern, s))
Output
['[Data Source].[Week]', '[Data Source].[Week]']
If you also want the values between square brackets when there is no dot, you can use an alternation with lookaround assertions:
\[[^][]*](?:\.\[[^][]*])+|(?<=\[)[^][]*(?=])
Explanation
\[[^][]*](?:\.\[[^][]*])+ The same as the previous pattern
| Or
(?<=\[)[^][]*(?=]) Match the text inside [...], asserting [ to the left and ] to the right
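A short sketch of how the alternation behaves with re.findall. The input string, including the extra [Year] part, is an assumption made up for illustration.

```python
import re

# Dotted chains keep their brackets (first alternative); a lone bracketed
# value is returned without brackets (second alternative, via lookarounds).
pattern = r"\[[^][]*](?:\.\[[^][]*])+|(?<=\[)[^][]*(?=])"

# Hypothetical input: one dotted pair and one standalone bracketed value
s = "CASE[Data Source].[Week] = 'THIS WEEK' AND [Year] = 2020"

matches = re.findall(pattern, s)
print(matches)
```

With this input the result is ['[Data Source].[Week]', 'Year']. The standalone value loses its brackets because the lookarounds only assert them without consuming them.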
I think an alternative approach could be:
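import re
# Sample input (assumed; the original snippet did not define sss)
sss = "CASE[Data Source].[Week] = 'THIS WEEK'"
# Raw string so the backslashes reach the regex engine unchanged
pattern = re.compile(r"(\[[^\]]*\]\.\[[^\]]*\])")
print(pattern.findall(sss))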
OUTPUT
['[Data Source].[Week]']
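One difference worth checking, sketched below: this pattern matches exactly two bracketed parts, while the repeating-group pattern from the other answer accepts longer chains. The three-part input is a hypothetical example, not from the question.

```python
import re

# Hypothetical three-part chain (assumed for illustration)
chained = "[Db].[Schema].[Table]"

# Exactly two bracketed parts joined by a dot
two_part = re.compile(r"\[[^\]]*\]\.\[[^\]]*\]")
# One bracketed part followed by one or more ".[...]" parts
repeating = re.compile(r"\[[^][]*](?:\.\[[^][]*])+")

two_result = two_part.findall(chained)
rep_result = repeating.findall(chained)
print(two_result)
print(rep_result)
```

Here the two-part pattern returns ['[Db].[Schema]'] and the repeating pattern returns ['[Db].[Schema].[Table]'], so the choice depends on whether longer chains can occur in the data.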