Python regex to match multi-line text

Question

I have the following text in a file:

INCLUDE '.\..\..\
FE_10-28\
ASSY.bdf'

INCLUDE '.\..\..\FE_10-28\standalone\COORD.bdf'

$ INCLUDE '.\..\..\FE_10-28\standalone\bracket.bdf'

$ INCLUDE '.\..\..\
$ FE_10-28\standalone\
$ ITFC.bdf'

I would like an expression that captures the quoted strings, skipping lines that begin with $:

['.\..\..\FE_10-28\ASSY.bdf', '.\..\..\FE_10-28\standalone\COORD.bdf']

I managed to match the single-line strings:

    import re

    with open(bdf_name, 'r') as f:
        file_buff = f.readlines()

    text = ''.join(file_buff)
    # Raw string so that \s reaches the regex engine unchanged
    regex_incl = re.compile(r"[^$]\s+include\s+'(.*)'", re.IGNORECASE | re.MULTILINE)
    print(regex_incl.findall(text))

But how would I handle the multi-line case?

Answer 1

First, you need the re.DOTALL flag; otherwise a dot . does not match newlines. Also, read all the data at once.

import re

with open(bdf_name, 'r') as f:
    data = f.read()  # read the whole file into one string

re.findall(r"^include\s+'(.*?)'", data,
           flags=re.IGNORECASE | re.MULTILINE | re.DOTALL)
# ['.\\..\\..\\\nFE_10-28\\\nASSY.bdf', '.\\..\\..\\FE_10-28\\standalone\\COORD.bdf']

If you do not want the line breaks, remove them from each match with .replace("\n", "").
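Putting the two steps together, a minimal sketch might look like the following. The helper name `extract_includes` and the inline `sample` text are assumptions made for illustration and are not part of the original answer; the pattern and flags are the ones described above.

```python
import re

# Same pattern and flags as described above; the raw string keeps \s intact.
INCLUDE_RE = re.compile(r"^include\s+'(.*?)'",
                        flags=re.IGNORECASE | re.MULTILINE | re.DOTALL)


def extract_includes(text):
    """Return INCLUDE paths, joining paths that span several lines.

    Hypothetical helper, for illustration only.
    """
    # Lines starting with "$" never match because "^include" must begin a line.
    return [path.replace("\n", "") for path in INCLUDE_RE.findall(text)]


# Inline stand-in for the file contents (raw string, so backslashes are literal).
sample = r"""INCLUDE '.\..\..\
FE_10-28\
ASSY.bdf'

INCLUDE '.\..\..\FE_10-28\standalone\COORD.bdf'

$ INCLUDE '.\..\..\FE_10-28\standalone\bracket.bdf'

$ INCLUDE '.\..\..\
$ FE_10-28\standalone\
$ ITFC.bdf'
"""

print(extract_includes(sample))
```

With a real file, `sample` would be replaced by the result of `f.read()`.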

Answer 2

You can use this regex:

>>> import re
>>> raw = r'''
... INCLUDE '.\..\..\
... FE_10-28\
... ASSY.bdf'
...
... INCLUDE '.\..\..\FE_10-28\standalone\COORD.bdf'
...
... $ INCLUDE '.\..\..\FE_10-28\standalone\bracket.bdf'
...
... $ INCLUDE '.\..\..\
... $ FE_10-28\standalone\
... $ ITFC.bdf'
... '''
>>>
>>> re.findall(r"^INCLUDE\s+'(.+?)'\n", raw, re.M|re.DOTALL)
['.\\..\\..\\\nFE_10-28\\\nASSY.bdf', '.\\..\\..\\FE_10-28\\standalone\\COORD.bdf']
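To see why both flags matter, here is a small sketch on a made-up three-line string (the variable names and the sample text are assumptions, chosen only to keep the example short). It suggests that `re.M` is what lets `^` match at each line start, `re.DOTALL` is what lets `.` cross the line break inside the quotes, and the `^` anchor is what keeps `$`-prefixed lines out.

```python
import re

# Made-up sample: a "$" comment line, a two-line INCLUDE, a commented INCLUDE.
text = "$ title\nINCLUDE 'a\nb'\n$ INCLUDE 'c'\n"
pattern = r"^INCLUDE\s+'(.+?)'"

no_flags = re.findall(pattern, text)                    # "^" only matches at position 0
dotall_only = re.findall(pattern, text, re.DOTALL)      # still anchored to position 0
multiline_only = re.findall(pattern, text, re.M)        # "." cannot cross the newline
both = re.findall(pattern, text, re.M | re.DOTALL)      # finds the two-line path

# Without the "^" anchor, the commented-out INCLUDE is picked up as well.
unanchored = re.findall(r"INCLUDE\s+'(.+?)'", text, re.DOTALL)

print(no_flags, dotall_only, multiline_only, both, unanchored)
```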

Answer 1
2 2018-03-27 06:48:24

Answer 2
2 Accepted 2018-03-27 06:54:18
