Regex search for a pair of parentheses on different lines in Python
This is an example string from a file I work with:
apple (sweet
fruit) at home
If I want to find everything between the parentheses and remove it, how can I do that? This is the result I expect:
apple at home
I tried the following, but it doesn't work because the opening and closing parentheses are on two different lines.
re.sub(r'\(\s*([^)]+)\)', '', line)
Try:
re.sub(r'\s*\([^)]+\)', '', line)
In a Python regex, ( and ) are normally used for grouping. Because you want to match literal parens, not do grouping, we replace ( by \( and ) by \).
Example:
>>> print(line)
apple (sweet
fruit) at home
>>> import re
>>> re.sub(r'\s*\([^)]+\)', '', line)
'apple at home'
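The substitution above works across the line break because the negated character class [^)] matches any character except a closing paren, newlines included. A minimal sketch that wraps it in a function follows; the names PAREN_RE and strip_parens are illustrative and not part of the original answer.

```python
import re

# Hypothetical names for illustration only.
# \s*     optional whitespace before the opening paren
# \(      a literal opening paren
# [^)]+   one or more characters that are not ')', newlines included
# \)      a literal closing paren
PAREN_RE = re.compile(r'\s*\([^)]+\)')

def strip_parens(text):
    """Remove each parenthesized span and any whitespace just before it."""
    return PAREN_RE.sub('', text)

line = "apple (sweet\nfruit) at home"
print(repr(strip_parens(line)))  # 'apple at home'
```

Note that this pattern does not handle nested parentheses; it stops at the first closing paren it finds.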
Using the read method, we can successfully do the multiline substitution:
>>> import re
>>> line = open('File').read()
>>> print(line)
apple (sweet
fruit) at home
>>> re.sub(r'\s*\([^)]+\)', '', line)
'apple at home\n'
If we use the readlines method, though, we have problems:
>>> line = open('File').readlines()
>>> print(line)
['apple (sweet\n', ' fruit) at home\n']
readlines creates a list of lines. re.sub requires a string, not a list. Therefore, we need to use join to get a successful substitution:
>>> re.sub(r'\s*\([^)]+\)', '', ''.join(line))
'apple at home\n'
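The read and readlines approaches can be compared side by side in a script. The sketch below is self-contained under one assumption: it writes the sample text to a temporary file instead of relying on a file named 'File' already existing. All variable names are illustrative.

```python
import os
import re
import tempfile

# Write the sample text to a temporary file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("apple (sweet\nfruit) at home\n")
    path = tmp.name

pattern = r'\s*\([^)]+\)'

try:
    with open(path) as f:
        whole = f.read()        # one string, newlines included
    with open(path) as f:
        lines = f.readlines()   # list of strings, one per line

    # Both of these see the full text, so the pair of parens is matched.
    from_read = re.sub(pattern, '', whole)
    from_lines = re.sub(pattern, '', ''.join(lines))

    # Applied line by line, the pattern never sees both parens together,
    # so nothing is removed.
    per_line = [re.sub(pattern, '', l) for l in lines]
finally:
    os.remove(path)

print(repr(from_read))
print(repr(from_lines))
print(per_line)
```

Opening the file in a with block also closes it promptly, which the bare open(...).read() form in an interactive session does not guarantee.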
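You'll need to use re.DOTALL and a non-greedy match. re.DOTALL makes . match newlines as well; re.MULTILINE only changes how ^ and $ behave, so it does not help here. The flag must also be passed by keyword, because the fourth positional argument of re.sub is count, not flags.
re.sub(r'\(.+?\)', '', line, flags=re.DOTALL)
Reference: https://docs.python.org/2/library/re.html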
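A quick way to see how the flags behave on the sample string is to run the non-greedy pattern with each one. This is a small sketch; the variable names are illustrative.

```python
import re

line = "apple (sweet\nfruit) at home"

# re.MULTILINE only affects ^ and $; '.' still stops at the newline,
# so the parenthesized span is not matched and the string is unchanged.
with_multiline = re.sub(r'\(.+?\)', '', line, flags=re.MULTILINE)

# re.DOTALL lets '.' match the newline, so the span across lines is removed.
with_dotall = re.sub(r'\(.+?\)', '', line, flags=re.DOTALL)

print(repr(with_multiline))
print(repr(with_dotall))
```

Because this pattern does not consume the space before the opening paren, the DOTALL result keeps both surrounding spaces; prefix the pattern with \s* to drop one of them.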