Python regex: find all substrings between two delimiters

Question

I've been struggling with this problem for over a day and I just can't figure it out.

The problem is the following. Given the text:

Obratite pažnju na sljedece:
Pad prometa
Rentabilnost imovine
Neto maržu

**************************************************************

I need to extract all the text between the word "sljedece:" (without the quotation marks) and the row of asterisks.

I tried to use the following code:

import re

text =  """
Obratite pažnju na sljedece:
Pad prometa
Rentabilnost imovine
Neto maržu

**************************************************************
"""
pattern = r"sljecece:(.*?)\*+"
napomene = re.findall(pattern, text)

print(napomene)

But it prints out an empty list.

Thanks to everyone in advance!

Answer 1

You have to pass re.DOTALL to make . match newlines:

re.findall(pattern, text, re.DOTALL)

You also have a typo in your pattern: r"sljecece:(.*?)\*+" should be r"sljedece:(.*?)\*+".
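Putting both fixes together, here is a minimal sketch using the text from the question. The final cleanup step (the `lines` list) is my own addition for illustration and is not part of the original answer.

```python
import re

text = """
Obratite pažnju na sljedece:
Pad prometa
Rentabilnost imovine
Neto maržu

**************************************************************
"""

# Spelling corrected from "sljecece" to "sljedece".
pattern = r"sljedece:(.*?)\*+"

# Without re.DOTALL, "." stops at the newline right after the colon,
# so the pattern never reaches the asterisks and nothing is found.
without_flag = re.findall(pattern, text)

# With re.DOTALL, "." also matches "\n", so the lazy group can span lines.
napomene = re.findall(pattern, text, re.DOTALL)

# The capture keeps its surrounding newlines; split it into clean lines.
lines = [line.strip() for line in napomene[0].splitlines() if line.strip()]

print(without_flag)  # []
print(lines)         # ['Pad prometa', 'Rentabilnost imovine', 'Neto maržu']
```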

Answer 2

To be more efficient, you can reduce the work done by the lazy quantifier by having it consume whole lines at a time until it reaches the asterisk line:

re.findall(r'\bsljedece:((?:.*\n)+?)\*+$', text, re.M)

Perhaps the re.search method is more appropriate in your case.
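Since the text contains only one such block, a re.search version might look like the sketch below. The function name `extract_notes` is hypothetical, chosen here for illustration; the pattern and the re.M flag are the ones from this answer.

```python
import re

def extract_notes(text):
    """Return the non-empty lines between "sljedece:" and the asterisk row.

    Hypothetical helper for illustration; returns an empty list if the
    block is not found.
    """
    # (?:.*\n)+? lazily consumes one whole line at a time; with re.M,
    # "$" matches at the end of the asterisk line.
    match = re.search(r'\bsljedece:((?:.*\n)+?)\*+$', text, re.M)
    if match is None:
        return []
    return [line.strip() for line in match.group(1).splitlines() if line.strip()]

text = """
Obratite pažnju na sljedece:
Pad prometa
Rentabilnost imovine
Neto maržu

**************************************************************
"""

print(extract_notes(text))  # ['Pad prometa', 'Rentabilnost imovine', 'Neto maržu']
```

Unlike re.findall, re.search stops at the first match and returns a match object (or None), which makes the "not found" case explicit.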

2 answers

Answer 1: 4 (accepted), 2016-11-06 16:46:27

Answer 2: 0, 2016-11-06 17:08:08
