Replace string between two strings unless it contains a substring
I have a multiline string consisting of the following three lines:
Text1 Text2a Text3
Text1 Text2b Text3
Text1 Text2! Text3
I wish to replace all text between Text1 and Text3 with Text4, unless the intermediate text contains the character !. Thus, the desired output is:
Text1 Text4 Text3
Text1 Text4 Text3
Text1 Text2! Text3
Let c be the multiline string above. I believe re.sub is the natural choice for this problem, so I tried the following:
c = re.sub("Text1(.*?)(?!=\!)Text3", "Text1 Text4 Text3", c, flags=re.DOTALL)
However, it replaces every intermediate text with Text4. That is, I get the following output:
Text1 Text4 Text3
Text1 Text4 Text3
Text1 Text4 Text3
How can I resolve this?
I would phrase this as:
import re
c = """Text1 Text2a Text3
Text1 Text2b Text3
Text1 Text2! Text3"""
c = re.sub(r"^Text1(?: [^\s!]+)+ Text3$", "Text1 Text4 Text3", c, flags=re.M)
print(c)
This prints:
Text1 Text4 Text3
Text1 Text4 Text3
Text1 Text2! Text3
Here is an explanation of the regex pattern used:
^                from the start of the line (re.M is multiline mode)
Text1            match "Text1"
(?: [^\s!]+)+    then match one or more space-separated non-whitespace terms NOT containing !
 Text3           match a space and "Text3"
$                end of the line
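The breakdown above can also be written directly into the pattern with re.VERBOSE. This is only a sketch: the helper name replace_middle and the extra sample line with a two-word middle are assumptions added for illustration. In verbose mode bare spaces are ignored, so each literal space is written as [ ].

```python
import re

# Same pattern as in the answer, spelled out with re.VERBOSE.
# Literal spaces are written as [ ] because verbose mode ignores bare whitespace.
PATTERN = re.compile(
    r"""
    ^Text1            # "Text1" at the start of a line (re.M)
    (?:[ ][^\s!]+)+   # one or more space-separated words with no "!" in them
    [ ]Text3$         # a space, then "Text3" at the end of the line
    """,
    flags=re.M | re.VERBOSE,
)


def replace_middle(text: str) -> str:
    """Replace the text between Text1 and Text3 unless it contains '!'."""
    return PATTERN.sub("Text1 Text4 Text3", text)


# Hypothetical sample: the second line has a two-word middle section.
sample = "Text1 Text2a Text3\nText1 two words Text3\nText1 Text2! Text3"
result = replace_middle(sample)
print(result)
```

Because the pattern is anchored with ^ and $, a line is either rewritten as a whole or left untouched.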
You don't really need a negative lookahead to achieve your results. Matching anything except the ! character would do just fine. Modifying your regex as follows fixes the issue:
c = re.sub(r"Text1([^!]*?)Text3", "Text1 Text4 Text3", c, flags=re.DOTALL)
You can play with it online here and understand more about the regex here.
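One caveat with the character-class approach, sketched here on a hypothetical input that is not part of the question: [^!] also matches newlines, so if a line contains Text1 but no Text3, the match can run on into the next line. Adding \n to the excluded characters keeps each match on one line.

```python
import re

# Hypothetical input: the first line has "Text1" but no closing "Text3".
sample = "Text1 no closing marker\nText1 Text2a Text3"

# [^!] also matches "\n", so this match runs across the line break.
spanning = re.sub(r"Text1[^!]*?Text3", "Text1 Text4 Text3", sample)

# Excluding "\n" as well keeps every match on a single line.
per_line = re.sub(r"Text1[^!\n]*?Text3", "Text1 Text4 Text3", sample)

print(repr(spanning))
print(repr(per_line))
```

For the three-line string in the question both variants give the same result; the difference only shows up when a line is missing its closing Text3.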
Use the non-greedy .*? pattern to match as little text as possible before attempting to match the next pattern. On its own that still matches the line containing !, so you also need a lookahead assertion to check whether the ! character is present in the intermediate text. A positive lookahead, (?=!), placed directly before Text3 can never succeed, because the next character cannot be both ! and T. A negative lookahead, (?!.*!), placed after Text1 does the job, as in the following example:
import re
c = """Text1 Text2a Text3
Text1 Text2b Text3
Text1 Text2! Text3"""
# (?!.*!) rejects a line whose remainder contains "!"; without re.DOTALL, "." stops at the newline
c = re.sub(r"^Text1 (?!.*!).*? Text3$", "Text1 Text4 Text3", c, flags=re.M)
print(c)
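As a further option not taken from the answers above, re.sub also accepts a function as the replacement, which moves the "contains !" check out of the pattern and into ordinary Python. A minimal sketch, assuming the same three-line input; the function name swap_unless_bang is made up for illustration:

```python
import re


def swap_unless_bang(match: re.Match) -> str:
    """Return the replacement text, or the original match if it contains '!'."""
    middle = match.group(1)
    if "!" in middle:
        return match.group(0)  # leave this occurrence unchanged
    return "Text1 Text4 Text3"


c = """Text1 Text2a Text3
Text1 Text2b Text3
Text1 Text2! Text3"""

# Without re.DOTALL, "." does not cross newlines, so each match stays on one line.
result = re.sub(r"Text1(.*?)Text3", swap_unless_bang, c)
print(result)
```

This trades a slightly longer snippet for a simpler pattern, which may be easier to extend if the exclusion rule later becomes more involved than a single character.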