Regex search for a pair of parentheses on different lines in Python

This is the example string in a file I work with:

apple (sweet
   fruit) at home

If I want to find everything between the parentheses and remove it, how do I do it? This is the result that I expect:

apple at home

I tried the code below, but it doesn't work because the parentheses are on two different lines.

re.sub(r'\(\s*([^)]+)\)', '', line)

Try:

re.sub(r'\s*\([^)]+\)', '', line)

In a Python regex, ( and ) are normally used for grouping. Because you want to match literal parentheses rather than group, we write \( in place of ( and \) in place of ). The negated class [^)] matches any character except ), including a newline, so the match can span lines as long as the whole text is in one string.

Example:

>>> print(line)
apple (sweet

      fruit) at home
>>> import re
>>> re.sub(r'\s*\([^)]+\)', '', line)
'apple at home'
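
As a minimal sketch of the difference between grouping and literal parentheses, the snippet below uses the sample string from the question. The variable names (text, pattern, match, result) are illustrative and not part of the original answer.

```python
import re

text = "apple (sweet\n   fruit) at home"

# Unescaped parentheses form a capturing group; they match no literal characters.
print(re.findall(r'(sweet)', text))        # ['sweet']

# Escaped parentheses match the literal characters "(" and ")".
# [^)] is a negated class, so it also matches the newline between the two lines.
pattern = re.compile(r'\s*\([^)]+\)')
match = pattern.search(text)
print(repr(match.group()))                 # ' (sweet\n   fruit)'

result = pattern.sub('', text)
print(result)                              # apple at home
```

The leading \s* removes the whitespace in front of the opening parenthesis, which is why only a single space remains between "apple" and "at".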

Issues with reading a multiline string from a file

Using the read method, we can successfully do the multiline substitution:

>>> import re
>>> line = open('File').read()
>>> print(line)
apple (sweet
   fruit) at home

>>> re.sub(r'\s*\([^)]+\)', '', line)
'apple at home\n'

If we use the readlines method, though, we have problems:

>>> line = open('File').readlines()
>>> print(line)
['apple (sweet\n', '   fruit) at home\n']

readlines creates a list of lines. re.sub requires a string, not a list. Therefore, we need to use join to get a successful substitution:

>>> re.sub(r'\s*\([^)]+\)', '', ''.join(line))
'apple at home\n'
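
The same point can be sketched as a standalone script. It assumes a hypothetical temporary file holding the sample text, created here only so the example runs on its own, and compares substituting line by line with substituting on the whole file contents.

```python
import os
import re
import tempfile

# Hypothetical sample file, created here so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('apple (sweet\n   fruit) at home\n')
    path = tmp.name

pattern = re.compile(r'\s*\([^)]+\)')

# Line by line: neither line contains both "(" and ")", so nothing is removed.
with open(path) as f:
    per_line = [pattern.sub('', line) for line in f]

# Whole file at once: the match can span the newline.
with open(path) as f:
    whole = pattern.sub('', f.read())

os.remove(path)

print(per_line)     # ['apple (sweet\n', '   fruit) at home\n']
print(repr(whole))  # 'apple at home\n'
```

Opening the file in a with block closes it automatically, which the interactive examples above skip for brevity.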

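You'll need to use re.DOTALL and a non-greedy match. re.MULTILINE only changes how ^ and $ behave; re.DOTALL is the flag that lets . match a newline. Pass it as flags=, because the fourth positional argument of re.sub is count.

re.sub(r'\(.+?\)', '', line, flags=re.DOTALL)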
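
The sketch below compares re.MULTILINE and re.DOTALL on the sample string from the question. The variable names are illustrative, not from the original post.

```python
import re

text = "apple (sweet\n   fruit) at home"

# re.MULTILINE only changes ^ and $; "." still stops at the newline, so no match.
multiline = re.sub(r'\(.+?\)', '', text, flags=re.MULTILINE)

# re.DOTALL lets "." match the newline, so the pair is found across lines.
dotall = re.sub(r'\(.+?\)', '', text, flags=re.DOTALL)

print(repr(multiline))  # 'apple (sweet\n   fruit) at home'
print(repr(dotall))     # 'apple  at home'  (two spaces: this pattern has no leading \s*)
```

Flags are passed by keyword here because the fourth positional parameter of re.sub is count, so a flag given in that position would be read as a replacement limit.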

Reference: https://docs.python.org/2/library/re.html
