Python regex - Find multiple characters in a substring
print 'cycle' ;While i in range(1,n) [[print "Number:" ;print i; print 'and,']]
For example, I have a line like the one above. I want to extract only the semicolon characters that sit inside the [[ ... ]] substring, that is, inside the double square brackets.
If I use
re.search(\\[\\[.*(\\s*;).*\\]\\])
I get only one semicolon. Is there a proper solution for this?
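The behaviour described in the question can be reproduced with a short sketch. It assumes the pattern was meant as the raw string shown below and that it was applied to the sample line; the variable names are illustrative, not from the original post. re.search returns a single match, and the one capture group can hold only one semicolon (here the last one, because the leading .* is greedy).

```python
import re

# Sample line from the question, stored as plain string data.
line = "print 'cycle' ;While i in range(1,n) [[print \"Number:\" ;print i; print 'and,']]"

# Assumed raw-string form of the pattern from the question.
m = re.search(r"\[\[.*(\s*;).*\]\]", line)

captured = m.group(1)      # text held by the single capture group
captured_at = m.start(1)   # index of that captured text in the line

print(repr(captured))                    # ';'
print(captured_at == line.rindex(";"))   # True: the greedy .* leaves only the last ';'
```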
Regex is never a great choice for things like this because it's very easy to trip up, but the following pattern works in trivial cases:
;(?=(?:(?!\[\[).)*\]\])
Pattern breakdown:
; # match literal ";"
(?= # lookahead assertion: assert the following pattern matches:
(?:
(?!\[\[) # as long as we don't find a "[["...
. # ...consume the next character
)* # ...as often as necessary
\]\] # until we find "]]"
)
In other words, the pattern checks whether a semicolon is followed by ]] with no [[ occurring between the semicolon and that ]].
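As a minimal sketch of how the pattern might be used, the snippet below applies it to the sample line from the question with re.finditer. The variable names are assumptions made for illustration.

```python
import re

# Sample line from the question (the Python 2 style text is just string data here).
line = "print 'cycle' ;While i in range(1,n) [[print \"Number:\" ;print i; print 'and,']]"

# The lookahead pattern from the answer, written as a raw string.
pattern = re.compile(r";(?=(?:(?!\[\[).)*\]\])")

# finditer yields one match per semicolon that passes the lookahead.
positions = [m.start() for m in pattern.finditer(line)]

print(len(positions))    # 2: the two semicolons inside [[ ... ]]
print(line.count(";"))   # 3: the semicolon before "While" is skipped
```

The semicolon before "While" is rejected because the lookahead runs into [[ before it can reach ]].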
Examples of strings where the pattern won't work:
; ]] (will match, even though there is no opening [[)
[[ ; "this is text [[" ]] (won't match, because of the [[ inside the quoted text)
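The two edge cases above can be checked directly. The sketch below also shows one possible alternative, which is not part of the original answer: first locate each [[ ... ]] span, then look for semicolons inside it. The helper name semicolons_in_brackets is hypothetical. This variant handles the two strings above, but it would still be fooled by nested brackets or by a ]] inside quoted text.

```python
import re

# The lookahead pattern from the answer.
lookahead = re.compile(r";(?=(?:(?!\[\[).)*\]\])")

false_positive = "; ]]"                        # no opening [[ at all
false_negative = '[[ ; "this is text [[" ]]'   # [[ appears inside the quoted text

print(lookahead.findall(false_positive))  # [';']
print(lookahead.findall(false_negative))  # []


def semicolons_in_brackets(text):
    """Hypothetical helper: indices of ';' inside each [[ ... ]] span."""
    positions = []
    # Non-greedy match from each [[ to the nearest following ]].
    for block in re.finditer(r"\[\[(.*?)\]\]", text):
        offset = block.start(1)
        positions.extend(
            offset + i for i, ch in enumerate(block.group(1)) if ch == ";"
        )
    return positions


print(semicolons_in_brackets(false_positive))  # []
print(semicolons_in_brackets(false_negative))  # [3]
```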