Replace a string that starts with a word, up to two newline characters

Question

I am new to regex in Python and am trying to replace a substring inside a string. My substring starts with a specific word and ends with two newline characters.

Below is what I tried:

import re
a=re.sub(r'Report from.+', r' ', 'To: abcd@gef.org;     Report from xxxxx \n     Category\t Score\t  \n xxxxxxxxxxx xxxxxxxxt  \n xxxxxxx\t xxxxxxx\t \n\n original message\n')

Output:

To: abcd@gef.org;      
     Category    Score    
 xxxxxxxxxxx xxxxxxxxt  
 xxxxxxx     xxxxxxx     

 original message

Expected output:

To: abcd@gef.org;      
 original message

I also tried:

re.sub(r'Report from.+\n', r' ', 'To: abcd@gef.org;     Report from xxxxx \n     Category\t Score\t  \n xxxxxxxxxxx xxxxxxxxt  \n xxxxxxx\t xxxxxxx\t \n\n original message\n')

but it wasn't even matching the "Report from" literal.

I think I am halfway there. Can anyone please help?

Edit: I want to replace everything that starts with "Report from" all the way until the first occurrence of two newline characters.

Answer 1

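You want to use ? after the quantifier (.+?) to make the match non-greedy, so that it stops at the first pair of newlines, which marks the 'end' of the substring you wish to replace. You also need flags=re.DOTALL so that . matches newline characters; by default it does not, which is why the match in your first attempt stopped at the end of the "Report from" line.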

import re

text = 'To: abcd@gef.org;     Report from xxxxx \n     Category\t Score\t  \n xxxxxxxxxxx xxxxxxxxt  \n xxxxxxx\t xxxxxxx\t \n\n original message\n'

a=re.sub(r'Report from.+?\n\n', r'\n', text, flags=re.DOTALL)

print(a)

To: abcd@gef.org;     
 original message
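
To make the role of each piece clearer, here is a small sketch comparing three variants of the pattern. The sample text is a made-up extension of the one in the question (an extra blank line and a "second paragraph" line are added purely for illustration), and the variable names lazy, greedy and no_dotall are my own.

```python
import re

# Hypothetical sample: the question's text plus a second blank line,
# so that the greedy and lazy variants give different results.
text = ('To: abcd@gef.org;     Report from xxxxx \n'
        '     Category\t Score\t  \n'
        ' xxxxxxx\t xxxxxxx\t \n\n'
        ' original message\n\n'
        ' second paragraph\n')

# Lazy (.+?): stops at the first blank line after "Report from".
lazy = re.sub(r'Report from.+?\n\n', '\n', text, flags=re.DOTALL)

# Greedy (.+): runs on to the last blank line in the text.
greedy = re.sub(r'Report from.+\n\n', '\n', text, flags=re.DOTALL)

# Without DOTALL, "." cannot cross a newline, so nothing matches
# and the text comes back unchanged.
no_dotall = re.sub(r'Report from.+?\n\n', '\n', text)

print(repr(lazy))
print(repr(greedy))
print(no_dotall == text)
```

With this sample, the lazy version keeps " original message", the greedy version swallows it along with the report, and the version without re.DOTALL changes nothing.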

Answer 2

Consider writing a simple state machine to do this. You have two states: you are looking for the first line in a block, or you are in a block and looking for the blank line. ("Two consecutive newlines" is the same as "I see a blank line when I read through the file line by line".)

from enum import Enum, auto

class LookFor(Enum):
  REPORT = auto()  # looking for the first line of a block
  BLANK = auto()   # inside a block, looking for the blank line

filename = 'message.txt'  # hypothetical path; point this at your own file

state = LookFor.REPORT
with open(filename, 'r') as f:
  for line in f:
    if state == LookFor.REPORT:
      print(line, end='')
      if line.startswith('Report from'):
        state = LookFor.BLANK
    elif state == LookFor.BLANK:
      if line == '\n':
        print(line, end='')
        state = LookFor.REPORT

The specific code I've written makes some assumptions about what you're looking for, in particular that you can iterate through the input line by line. You could adapt this to make more complex decisions about which state to switch to, or add further states as suits your application.
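
As one possible adaptation, here is a sketch of the same two-state idea applied to an in-memory string like the one in the question, where "Report from" appears in the middle of a line instead of at its start. The function name strip_report and its marker parameter are hypothetical, and the code assumes Unix-style "\n" line endings and that the text before the marker on that line should be kept.

```python
from enum import Enum, auto


class LookFor(Enum):
    REPORT = auto()  # copying lines, watching for the marker
    BLANK = auto()   # inside a block, skipping until a blank line


def strip_report(text, marker='Report from'):
    """Drop everything from marker up to and including the next blank line."""
    state = LookFor.REPORT
    kept = []
    for line in text.splitlines(keepends=True):
        if state == LookFor.REPORT:
            head, found, _ = line.partition(marker)
            if found:
                # Keep whatever precedes the marker on this line.
                kept.append(head + '\n')
                state = LookFor.BLANK
            else:
                kept.append(line)
        elif line == '\n':
            # A blank line ends the block; resume copying.
            state = LookFor.REPORT
    return ''.join(kept)


sample = ('To: abcd@gef.org;     Report from xxxxx \n'
          '     Category\t Score\t  \n'
          ' xxxxxxx\t xxxxxxx\t \n\n'
          ' original message\n')
print(strip_report(sample))
```

Unlike the regex, this version never needs the whole block to match a single pattern, so it may be easier to extend if the end-of-block rule becomes more complicated.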

Answer 1
1 Accepted 2018-07-19 20:31:48

Answer 2
1 2018-07-20 00:20:34
