Regular expression find and replace multiple
I am trying to write a regular expression that will match all cases of
[[any text or char here]]
in a series of text.
E.g.:
My name is [[Sean]]
There is a [[new and cool]] thing here.
This all works fine using my regex.
import re

data = "this is my test string [[ that does some matching ]] then returns."
p = re.compile(r"\[\[(.*)\]\]")
data = p.sub('STAR', data)
The problem is when I have multiple instances of the match occurring, e.g. [[hello]] and [[bye]].
E.g.:
data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"
p = re.compile(r"\[\[(.*)\]\]")
data = p.sub('STAR', data)
This will match from the opening bracket of hello to the closing bracket of bye. I want it to replace them both.
.* is greedy and matches as much text as it can, including ]] and [[, so it plows on through your "tag" boundaries.
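To see the greedy behaviour concretely, here is a small sketch that reuses the string from the question (the variable names `greedy` and `match` are just for illustration):

```python
import re

data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"

# Greedy: .* first runs to the end of the line, then backtracks
# only as far as the LAST "]]" it can find.
greedy = re.compile(r"\[\[(.*)\]\]")

match = greedy.search(data)
print(match.group(0))            # [[hello]] and [[bye]]
print(match.group(1))            # hello]] and [[bye
print(greedy.sub("STAR", data))  # only one STAR ends up in the result
```

So there is a single match spanning both tags, which is why only one replacement happens.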
A quick solution is to make the star lazy by adding a ?:
p = re.compile(r"\[\[(.*?)\]\]")
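A quick check of the lazy version against the string from the question (a sketch; `lazy`, `captured` and `result` are illustrative names):

```python
import re

data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"

# Lazy: .*? grows one character at a time and stops at the FIRST "]]".
lazy = re.compile(r"\[\[(.*?)\]\]")

captured = lazy.findall(data)    # text inside each pair of brackets
result = lazy.sub("STAR", data)  # every tag replaced separately

print(captured)  # ['hello', 'bye']
print(result)    # this is my new string it contains STAR and STAR and nothing else
```

Note that . does not match a newline by default, so a tag that spans several lines would need the re.DOTALL flag as well.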
A better (more robust and explicit but slightly slower) solution is to make it clear that we cannot match across tag boundaries:
p = re.compile(r"\[\[((?:(?!\]\]).)*)\]\]")
Explanation:
\[\[          # Match [[
(             # Match and capture...
  (?:         # ...the following regex:
    (?!\]\])  # only if we're not at the start of the sequence ]]
    .         # any character
  )*          # Repeat any number of times
)             # End of capturing group
\]\]          # Match ]]
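The commented form above can be handed to Python more or less as written by compiling with re.VERBOSE, which ignores whitespace and # comments inside the pattern. A sketch, with `tempered` as an illustrative name:

```python
import re

tempered = re.compile(
    r"""
    \[\[            # match [[
    (               # match and capture...
      (?:           # ...the following regex:
        (?!\]\])    # only if we are not at the start of ]]
        .           # any character
      )*            # repeat any number of times
    )               # end of capturing group
    \]\]            # match ]]
    """,
    re.VERBOSE,
)

data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"
print(tempered.findall(data))      # ['hello', 'bye']
print(tempered.sub("STAR", data))  # ... contains STAR and STAR and nothing else

# A single ] inside a tag is still allowed; only the pair ]] ends it.
print(tempered.findall("[[a]b]]"))  # ['a]b']
```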
Use ungreedy matching, .*?: the ? after a + or * makes it match as few characters as possible. The default is to be greedy, and consume as many characters as possible.
p = re.compile(r"\[\[(.*?)\]\]")
You can use this:
p = re.compile(r"\[\[[^\]]+\]\]")
>>> data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"
>>> p = re.compile(r"\[\[[^\]]+\]\]")
>>> data = p.sub('STAR', data)
>>> data
'this is my new string it contains STAR and STAR and nothing else'
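One thing worth checking before picking the negated character class: it behaves differently from the lazy pattern on some edge cases. The inputs below are made up for illustration and may not occur in your data:

```python
import re

negated = re.compile(r"\[\[[^\]]+\]\]")  # content may not contain any ]
lazy = re.compile(r"\[\[(.*?)\]\]")      # content ends at the first ]]

samples = [
    "[[hello]] and [[bye]]",  # ordinary tags: both patterns agree
    "[[a]b]] and [[]]",       # a lone ] inside a tag, and an empty tag
]

for text in samples:
    print(negated.sub("STAR", text), "|", lazy.sub("STAR", text))
# STAR and STAR | STAR and STAR
# [[a]b]] and [[]] | STAR and STAR
```

The negated class rejects a tag that contains a single ] and, because of the +, also an empty [[]]. On the other hand, [^\]] matches newlines, so it handles multi-line tags without any extra flag.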