re.findall multiline python

Question

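re.findall with re.M does not find the multi-line text I am searching for

I am trying to extract all multi-line strings that match a pattern from a file.

From the file book.txt: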

Title: Le Morte D'Arthur, Volume I (of II)
       King Arthur and of his Noble Knights of the Round Table

Author: Thomas Malory

Editor: William Caxton

Release Date: March, 1998  [Etext #1251]
Posting Date: November 6, 2009

Language: English

Title: Pride and Prejudice

Author: Jane Austen

Posting Date: August 26, 2008 [EBook #1342]
Release Date: June, 1998
Last Updated: October 17, 2016

Language: English

The following code only returns the first line, Le Morte D'Arthur, Volume I (of II):

re.findall(r'^Title:\s(.+)$', book, re.M)

I expect the output to be

[" Le Morte D'Arthur, Volume I (of II)\n King Arthur and of his Noble Knights of the Round Table", ' Pride and Prejudice']

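Clarifications:
- The second line is optional; it is present in some files and absent in others. After the second line there is more text that I do not want to read.
- Using re.findall(r'Title: (.+\n.+)$', text, flags=re.MULTILINE) works, but fails if the second line is only whitespace.
- I am running Python 3.7.
- I am reading the txt file into a string and then running re on that str.
- The following do not work either (re.S and re.DOTALL are the same flag):
re.findall(r'^Title:\s(.+)$', text, re.S)
re.findall(r'^Title:\s(.+)$', text, re.DOTALL)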
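For reference, the behavior of the attempts above can be reproduced on a shortened sample. This is only a sketch: the `text` value below is an assumed stand-in for the contents of book.txt, and the variable names are made up for illustration.

```python
import re

# Assumed, shortened stand-in for the contents of book.txt.
text = (
    "Title: Le Morte D'Arthur, Volume I (of II)\n"
    "       King Arthur and of his Noble Knights of the Round Table\n"
    "\n"
    "Author: Thomas Malory\n"
    "\n"
    "Title: Pride and Prejudice\n"
    "\n"
    "Author: Jane Austen\n"
)

# re.M: ^ and $ match at every line boundary, but . still stops at "\n",
# so each match is limited to the first line of a title.
with_multiline = re.findall(r'^Title:\s(.+)$', text, re.M)

# re.S (alias re.DOTALL): . also matches "\n", but ^ only matches at the
# start of the string, so the greedy .+ gives one match running to the end.
with_dotall = re.findall(r'^Title:\s(.+)$', text, re.S)

print(with_multiline)
print(len(with_dotall))
```

Neither flag on its own allows a match to start at each Title line and also continue onto the following line, which would explain the results described above.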

Answer 1

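My guess is that this expression,

(?<=Title:\s)(.*?)\s*(?=Author)

might be close to what you are looking for. It captures everything between "Title: " and the next "Author", so it has to be used with re.DOTALL in order to span line breaks.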

Test

import re

regex = r"(?<=Title:\s)(.*?)\s*(?=Author)"

# Each title block is followed by an "Author" line, which the lookahead requires.
test_str = ("Title: Le Morte D'Arthur, Volume I (of II)\n"
    "       King Arthur and of his Noble Knights of the Round Table\n\n"
    "Author: Thomas Malory\n\n"
    "Title: Pride and Prejudice\n\n"
    "Author: Jane Austen\n")

# re.DOTALL lets . match newlines, so the lazy group can span both title lines.
print(re.findall(regex, test_str, re.DOTALL))

Output

["Le Morte D'Arthur, Volume I (of II)\n       King Arthur and of his Noble Knights of the Round Table", 'Pride and Prejudice']
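As a variation that does not depend on an Author line following the title, the optional second line can be matched explicitly. The sketch below assumes that a continuation line is indented with spaces or tabs and contains at least one non-blank character; the names `pattern`, `titles` and `cleaned` are illustrative only.

```python
import re

# Assumed, shortened stand-in for the contents of book.txt.
text = (
    "Title: Le Morte D'Arthur, Volume I (of II)\n"
    "       King Arthur and of his Noble Knights of the Round Table\n"
    "\n"
    "Author: Thomas Malory\n"
    "\n"
    "Title: Pride and Prejudice\n"
    "\n"
    "Author: Jane Austen\n"
)

# The title line, then optionally ONE continuation line. The continuation
# must start with spaces/tabs and contain a non-whitespace character, so an
# empty or whitespace-only second line is simply not captured.
pattern = re.compile(r'^Title:[ \t]*(.+(?:\n[ \t]+\S.*)?)$', re.M)

titles = pattern.findall(text)

# Optional cleanup: collapse the line break and indentation into one space.
cleaned = [re.sub(r'\s*\n\s*', ' ', t) for t in titles]

print(titles)
print(cleaned)
```

Because the continuation group is optional, titles with and without a second line are handled by the same pattern, and only re.M is needed.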

Answer 2

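You can use a regular expression with the DOTALL flag to allow your . to match newlines:

re.findall(r'^Title:\s(.+)$', book, re.DOTALL)

Output (when book contains only the two title lines; otherwise the greedy .+ continues to the end of the string):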

Le Morte D'Arthur, Volume I (of II)\n       King Arthur and of his Noble Knights of the Round Table
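One way to keep DOTALL while limiting each match to a single title is to combine it with MULTILINE and a lazy quantifier. The sketch below is an assumption-laden variant, not part of the original answer: it treats the first blank or whitespace-only line (or the end of the string) as the end of a title block.

```python
import re

# Assumed, shortened stand-in for the contents of book.txt.
book = (
    "Title: Le Morte D'Arthur, Volume I (of II)\n"
    "       King Arthur and of his Noble Knights of the Round Table\n"
    "\n"
    "Author: Thomas Malory\n"
    "\n"
    "Title: Pride and Prejudice\n"
    "\n"
    "Author: Jane Austen\n"
)

# re.M lets ^ match at the start of every "Title:" line; re.S lets . cross
# line breaks. The lazy .+? stops at the first blank (or whitespace-only)
# line, or at the end of the string, whichever comes first.
pattern = re.compile(r'^Title:\s(.+?)(?=\n[ \t]*\n|\n?\Z)', re.M | re.S)

titles = pattern.findall(book)
print(titles)
```

If a title is followed directly by another non-blank line, that line is captured as well, so this only fits files where a blank line separates the title from the next field.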

Answer 1
1, accepted, 2019-07-18 15:06:03

Answer 2
1, 2019-07-18 15:12:13
