Python regex - finding all substrings between two delimiters

I've been dealing with this problem for over a day already and I just can't figure it out.

The problem I have is the following. Given the text:

Obratite pažnju na sljedece:
Pad prometa
Rentabilnost imovine
Neto maržu

**************************************************************

I need to extract all the text between the word "sljedece:" (without the quotation marks) and the row of asterisks.

I tried the following code:

import re

text =  """
Obratite pažnju na sljedece:
Pad prometa
Rentabilnost imovine
Neto maržu

**************************************************************
"""
pattern = r"sljecece:(.*?)\*+"
napomene = re.findall(pattern, text)

print(napomene)

But it prints an empty list.

Thanks to everyone in advance!

You have to pass re.DOTALL so that . also matches newlines:

re.findall(pattern, text, re.DOTALL)

You also have a typo in your pattern: r"sljecece:(.*?)\*+" should be r"sljedece:(.*?)\*+".
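Putting both fixes together, a minimal sketch of the corrected script could look like this. The `items` list and the strip/splitlines cleanup are additions for illustration and were not part of the original question:

```python
import re

text = """
Obratite pažnju na sljedece:
Pad prometa
Rentabilnost imovine
Neto maržu

**************************************************************
"""

# Typo fixed ("sljedece"), and re.DOTALL lets "." match newlines,
# so the lazy group can span several lines up to the asterisks.
pattern = r"sljedece:(.*?)\*+"
napomene = re.findall(pattern, text, re.DOTALL)

# Illustrative cleanup (an assumption about what is wanted):
# drop the surrounding newlines and split into individual lines.
items = [line for block in napomene for line in block.strip().splitlines()]
print(items)
```

Without the cleanup step, each captured block still contains the leading and trailing newlines that sit between the delimiters.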

To be more efficient, you can limit the backtracking of the lazy quantifier by grabbing entire lines at a time until the asterisk line is reached:

re.findall(r'\bsljedece:((?:.*\n)+?)\*+$', text, re.M)
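As a rough illustration of how this line-based pattern behaves on the sample text from the question (the variable names `line_pattern` and `blocks` are made up for this sketch):

```python
import re

text = """
Obratite pažnju na sljedece:
Pad prometa
Rentabilnost imovine
Neto maržu

**************************************************************
"""

# (?:.*\n)+? consumes one whole line per step, so re.DOTALL is not
# needed: "." stays within a line and "\n" is matched explicitly.
# re.M makes "$" match at the end of the asterisk line.
line_pattern = r"\bsljedece:((?:.*\n)+?)\*+$"
blocks = re.findall(line_pattern, text, re.M)

print(blocks)
```

Here the lazy group only checks for the asterisk row at the start of each line instead of after every single character, which is where the saving comes from.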

Perhaps the re.search method is more appropriate in your case. 也许re.search方法更适合您的情况。
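A minimal sketch of that idea, assuming only one such block is expected in the text: re.search returns a single match object (or None) rather than a list, so the captured group can be read directly. The fallback to an empty list is an assumption made for this example:

```python
import re

text = """
Obratite pažnju na sljedece:
Pad prometa
Rentabilnost imovine
Neto maržu

**************************************************************
"""

# re.search stops at the first match and returns None if there is none.
match = re.search(r"sljedece:(.*?)\*+", text, re.DOTALL)

if match:
    # group(1) is the text between the two delimiters.
    napomene = match.group(1).strip().splitlines()
else:
    napomene = []  # assumed fallback when the delimiters are missing

print(napomene)
```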
