Replace string between two strings unless it contains a substring

I have a multiline string with three lines of the following form:

Text1 Text2a Text3
Text1 Text2b Text3
Text1 Text2! Text3

I wish to replace all text between Text1 and Text3 with Text4, unless the intermediate text contains the character !. Thus, the desired output is:

Text1 Text4 Text3
Text1 Text4 Text3
Text1 Text2! Text3

Let c be the multiline string above. I believe re.sub is the natural choice for this problem, so I tried the following:

c = re.sub("Text1(.*?)(?!=\!)Text3", "Text1 Text4 Text3", c, flags=re.DOTALL)

However, it replaces every intermediate text with Text4. That is, I get the following output:

Text1 Text4 Text3
Text1 Text4 Text3
Text1 Text4 Text3

How can I resolve this?

I would phrase this as:

import re

c = """Text1 Text2a Text3
Text1 Text2b Text3
Text1 Text2! Text3"""

# Raw string so that \s reaches the regex engine unchanged
c = re.sub(r"^Text1(?: [^\s!]+)+ Text3$", "Text1 Text4 Text3", c, flags=re.M)
print(c)

This prints:

Text1 Text4 Text3
Text1 Text4 Text3
Text1 Text2! Text3

Here is an explanation of the regex pattern used:

  • ^ from the start of the line (re.M is multiline mode)
  • Text1 match "Text1"
  • (?: [^\s!]+)+ then match one or more space-separated, non-whitespace terms NOT containing !
  • Text3 (preceded by a space) match a space and "Text3"
  • $ end of the line
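To see what each piece of that pattern contributes, here is a minimal sketch that runs the same pattern through a small helper. The helper name replace_middle and the extra sample line are made up for illustration; only the pattern and re.M come from the answer above.

```python
import re

# Pattern from the answer: anchored per line, middle tokens must not contain "!"
PATTERN = re.compile(r"^Text1(?: [^\s!]+)+ Text3$", flags=re.M)


def replace_middle(text: str) -> str:
    """Replace the middle of every qualifying line with Text4 (hypothetical helper)."""
    return PATTERN.sub("Text1 Text4 Text3", text)


# Hypothetical sample: the last line has several tokens between Text1 and Text3
sample = "Text1 Text2a Text3\nText1 Text2! Text3\nText1 a b c Text3"
result = replace_middle(sample)
print(result)

# Without re.M, ^ and $ only anchor at the ends of the whole string,
# so this multi-line input is left unchanged.
unanchored = re.sub(r"^Text1(?: [^\s!]+)+ Text3$", "Text1 Text4 Text3", sample)
print(unanchored == sample)
```

The repeated group is what lets a line with several middle tokens qualify, while a single ! anywhere in the middle makes the whole line fail to match.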

You don't really need a negative lookahead to achieve your results. Matching anything except the ! character would do just fine. Modifying your regex as follows fixes the issue:

c = re.sub(r"Text1([^!]*?)Text3", "Text1 Text4 Text3", c, flags=re.DOTALL)

You can play with it online here and understand more about the regex here.
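One thing to keep in mind with a negated character class such as [^!] is that it also matches line breaks, with or without re.DOTALL. The sketch below uses a made-up input (not from the question) in which the first line has no closing Text3, and compares [^!] with [^!\n]:

```python
import re

# Hypothetical input: the first line has no closing Text3.
text = "Text1 a\nText1 b Text3"

# [^!] also matches "\n", so the match starts at the first Text1
# and runs across the line break.
across_lines = re.sub(r"Text1[^!]*?Text3", "Text1 Text4 Text3", text)

# Excluding "\n" as well keeps each match on a single line.
per_line = re.sub(r"Text1[^!\n]*?Text3", "Text1 Text4 Text3", text)

print(repr(across_lines))
print(repr(per_line))
```

For the three-line string in the question both variants give the same result; the difference only shows up when a line is missing one of the delimiters.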

Use the lazy .*? pattern so that as little text as possible is matched before the next part of the pattern is tried. You can also use a negative lookahead assertion, (?!!), in front of each matched character to make sure the ! character is not present in the intermediate text, as in the following example:

import re

c = """Text1 Text2a Text3
Text1 Text2b Text3
Text1 Text2! Text3"""

# (?:(?!!).)*? lazily consumes characters but refuses to step over a "!"
c = re.sub(r"Text1(?:(?!!).)*?Text3", "Text1 Text4 Text3", c, flags=re.DOTALL)

print(c)
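Another way to express "replace unless the middle contains !" is to match every Text1 ... Text3 candidate and make the decision in a replacement function passed to re.sub. This is only a sketch of that idea; the function name keep_if_bang is hypothetical and not part of any of the answers.

```python
import re


def keep_if_bang(match: re.Match) -> str:
    """Return the match unchanged if its middle contains "!", else the Text4 version."""
    middle = match.group(1)
    if "!" in middle:
        return match.group(0)
    return "Text1 Text4 Text3"


c = """Text1 Text2a Text3
Text1 Text2b Text3
Text1 Text2! Text3"""

# No re.DOTALL here, so .*? cannot cross a line break.
result = re.sub(r"Text1(.*?)Text3", keep_if_bang, c)
print(result)
```

The pattern stays simple and the exclusion rule lives in ordinary Python, which can be easier to extend if the condition later becomes more involved than a single character.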


