Python: extracting a (regex) pattern from a file without going line by line (multi-line search)

Question

I can extract a particular pattern by reading the mystring.txt file line by line and checking each line with re.search(r'pattern', line_txt).

The contents of mystring.txt are as follows:


Client: //home/SCM/dev/applications/build_system/test_suite_linux/unit_testing



Stream: //MainStream/testing_branch

Options:    dir, norm accel, ddl



SubmitOptions:  vis, dir, cas, cat


Using Python, I can get the stream name, //MainStream/testing_branch, like this:

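import re

with open("mystring.txt", "r") as f:
    for line in f:
        # Only lines that start with the "Stream:" label are of interest
        if re.search(r'^Stream:', line):
            # Split on any whitespace (tab or spaces) after the label
            stream_name = line.split(None, 1)[1].strip()
            print(stream_name)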

Instead of going line by line in a loop, how can I extract the same information using only the re module?

Answer 1

You can read the whole file in one go and use re.findall with the re.M flag. Be careful with very large files, since this loads the entire contents into memory.

import re

# Read the whole file, then let re.M anchor ^ at the start of every line
with open("input_file") as f:
    content = f.read()
print(re.findall(r"^Stream: (.*)", content, re.M))
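For the large-file caveat above, one possible workaround is to memory-map the file so the regex scans it without first copying everything into a Python string. This is only a sketch, not part of the original answer: the helper name find_streams is hypothetical, the file is assumed to be non-empty and UTF-8 encoded, and the pattern accepts either a tab or spaces after the label.

```python
import mmap
import re


def find_streams(path):
    """Return every value that follows a line-initial 'Stream:' label.

    Hypothetical helper; assumes a non-empty, UTF-8 encoded file.
    """
    with open(path, "rb") as f:
        # Map the file read-only; the OS pages data in on demand.
        # Note: mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # mmap exposes bytes, so the pattern must be a bytes pattern
            matches = re.findall(rb"^Stream:[ \t]*(.*)$", mm, re.M)
    # Decode after the map is closed; strip() also drops a stray '\r'
    return [m.decode("utf-8").strip() for m in matches]
```

Calling find_streams("mystring.txt") should then return a list with one entry per Stream: line, in the same way as the re.findall call above.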

Answer 2

Yes, you can use re.MULTILINE with re.search(..).

>>> import re
>>> f = open("mystring.txt")
>>> re.search(r'^Stream:\s([^\n]+)', f.read(), re.MULTILINE).group(1)
'//MainStream/testing_branch'
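Note that re.search returns None when nothing matches, so calling .group(1) directly raises AttributeError if the file has no Stream: line. A minimal sketch of a guarded variant follows; the names STREAM_RE and stream_name are hypothetical, and the pattern is loosened to allow a tab or spaces after the label.

```python
import re

# Compile once; [ \t]* accepts either a tab or spaces after the label
STREAM_RE = re.compile(r"^Stream:[ \t]*(\S.*)$", re.MULTILINE)


def stream_name(text):
    """Return the Stream value found in `text`, or None if there is none."""
    match = STREAM_RE.search(text)
    return match.group(1).strip() if match else None


sample = "Client: //home/x\n\nStream: //MainStream/testing_branch\n"
print(stream_name(sample))           # //MainStream/testing_branch
print(stream_name("no label here"))  # None
```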

Answer 3

Here is a solution:

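import re

# Read the whole file as a single string
with open("mystring.txt") as fh:
    text = fh.read()
# Grab the full "Stream: ..." line, then strip the label off
got = re.findall("Stream: .+\n", text)
got = got[0].strip()
print(got.split(": ")[1])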
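If more than the Stream value is needed, the same single-pass idea extends to every label in the file. The following is a sketch under the assumption that each field sits on its own line in the form Label: value, as in the sample mystring.txt; the helper name parse_fields is hypothetical.

```python
import re


def parse_fields(text):
    """Map each 'Label: value' line in `text` to a dict entry."""
    # Two groups per match: the label before the colon and the rest of the line
    pairs = re.findall(r"^(\w+):[ \t]*(.*)$", text, re.MULTILINE)
    return {label: value.strip() for label, value in pairs}
```

With the sample file, parse_fields(text)["Stream"] would give the same value that the answers above extract.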

3 solutions

Solution 1: score 2, 2016-05-19 17:55:29

Solution 2: score 1, accepted, 2016-05-19 17:49:31

Solution 3: score 0, 2016-05-19 17:59:12
