Regular expression find and replace for multiple matches

Question

I am trying to write a regular expression that will match all cases of

[[any text or char here]]

in a block of text.

E.g.:

My name is [[Sean]]
There is a [[new and cool]] thing here.

This all works fine using my regex.

import re

data = "this is my test string [[ that does some matching ]] then returns."
p = re.compile(r"\[\[(.*)\]\]")
data = p.sub('STAR', data)

The problem is when I have multiple instances of the match occurring: [[hello]] and [[bye]]

E.g.:

data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"
p = re.compile(r"\[\[(.*)\]\]")
data = p.sub('STAR', data)

This matches from the opening brackets of hello through to the closing brackets of bye, so everything in between is replaced by a single STAR. I want it to replace each of them separately.

Answer 1

.* is greedy and matches as much text as it can, including ]] and [[, so it plows on through your "tag" boundaries.

A quick solution is to make the star lazy by adding a ?:

p = re.compile(r"\[\[(.*?)\]\]")
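
To make the difference concrete, here is a small sketch that runs both patterns on the string from the question. The variable names (greedy, lazy, and so on) are only illustrative.

```python
import re

data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"

greedy = re.compile(r"\[\[(.*)\]\]")   # the pattern from the question
lazy = re.compile(r"\[\[(.*?)\]\]")    # same pattern with a lazy star

# Greedy: one match running from the first [[ to the last ]]
greedy_matches = greedy.findall(data)  # ['hello]] and [[bye']
# Lazy: one match per tag
lazy_matches = lazy.findall(data)      # ['hello', 'bye']

greedy_result = greedy.sub('STAR', data)
lazy_result = lazy.sub('STAR', data)

print(greedy_result)  # this is my new string it contains STAR and nothing else
print(lazy_result)    # this is my new string it contains STAR and STAR and nothing else
```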

A better (more robust and explicit but slightly slower) solution is to make it clear that we cannot match across tag boundaries:

p = re.compile(r"\[\[((?:(?!\]\]).)*)\]\]")

Explanation:

\[\[        # Match [[
(           # Match and capture...
 (?:        # ...the following regex:
  (?!\]\])  # (only if we're not at the start of the sequence ]])
  .         # any character
 )*         # Repeat any number of times
)           # End of capturing group
\]\]        # Match ]]
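
The commented layout above can also be used directly as a pattern, assuming it is compiled with re.VERBOSE so that whitespace and # comments inside the pattern are ignored. A minimal sketch using the sample sentences from the question (the name tag is only illustrative):

```python
import re

tag = re.compile(r"""
    \[\[            # match the opening brackets
    (               # capture...
      (?:           # ...the following group:
        (?!\]\])    # only if we're not at the start of the closing brackets
        .           # any character (except a newline, by default)
      )*            # repeated any number of times
    )               # end of capturing group
    \]\]            # match the closing brackets
""", re.VERBOSE)

data = "My name is [[Sean]]. There is a [[new and cool]] thing here."

captured = tag.findall(data)      # ['Sean', 'new and cool']
replaced = tag.sub("STAR", data)  # 'My name is STAR. There is a STAR thing here.'

print(captured)
print(replaced)
```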

Answer 2

Use ungreedy matching: .*? <~~ the ? after a + or * makes it match as few characters as possible. The default is to be greedy and consume as many characters as possible.

p = re.compile(r"\[\[(.*?)\]\]")

Answer 3

You can use this:

p = re.compile(r"\[\[[^\]]+\]\]")

>>> data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"
>>> p = re.compile(r"\[\[[^\]]+\]\]")
>>> data = p.sub('STAR', data)
>>> data
'this is my new string it contains STAR and STAR and nothing else'
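
Note that the negated character class is not quite equivalent to the lazy pattern. The sketch below compares the two on a few edge cases; the sample strings are made up for illustration and are not from the question.

```python
import re

negated = re.compile(r"\[\[[^\]]+\]\]")  # the pattern from this answer
lazy = re.compile(r"\[\[.*?\]\]")        # lazy pattern, for comparison

samples = [
    "[[hello]] and [[bye]]",  # ordinary tags
    "[[a]b]]",                # a single ] inside the tag
    "[[]]",                   # empty tag
    "[[a\nb]]",               # tag spanning a newline
]

# Map each sample to (result with negated class, result with lazy star)
results = {s: (negated.sub("STAR", s), lazy.sub("STAR", s)) for s in samples}

for s, (neg, lz) in results.items():
    print(repr(s), "->", repr(neg), "|", repr(lz))
```

With the negated class, a tag containing a single ] or an empty tag is left untouched, while a tag spanning a newline is replaced. The lazy pattern behaves the other way round in each of those cases, unless re.DOTALL is used to let . match newlines.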

3 answers

Answer 1: score 3, accepted, 2012-10-31 12:07:11
Answer 2: score 2, 2012-10-31 12:07:49
Answer 3: score 1, 2012-10-31 12:12:32
