Using regex to get passage between two strings in Python
I want to parse all of the functions inside of a .txt file. It looks like this:
def
test
end
def
hello
end
def
world
end
So, I would like to get the following returned: [test, hello, world]
Here is what I have tried, but I do not get anything back:
r = re.findall('def(.*?)end', doc)
print(r)
You have to use the re.DOTALL flag, which allows . to match newlines too (since your doc is multi-line).
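As a minimal sketch of the difference the flag makes, the snippet below runs the same pattern with and without re.DOTALL. The contents of doc are an assumption (a hypothetical sample with each body indented by one space), and the strip() call is an addition to drop the surrounding whitespace from each capture.

```python
import re

# Hypothetical sample input; each body is assumed to be indented by one space.
doc = "def\n test\nend\ndef\n hello\nend\ndef\n world\nend\n"

# Without re.DOTALL, '.' stops at newlines, so no def/end pair can match.
without_flag = re.findall(r"def(.*?)end", doc)
print(without_flag)  # []

# With re.DOTALL, '.' also matches '\n'; strip() trims the captured whitespace.
names = [body.strip() for body in re.findall(r"def(.*?)end", doc, re.DOTALL)]
print(names)  # ['test', 'hello', 'world']
```

Note that the lazy .*? stops at the first occurrence of "end", so a body containing that substring would be cut short.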
You could additionally use '^def' and '^end' in the regex if you only wanted the outer def/end blocks (i.e. ignore indented ones), in which case you would also need the re.MULTILINE flag, which allows '^' and '$' to match at the start/end of each line (as opposed to the start/end of the string).
re.findall('^def(.*?)^end',doc,re.DOTALL|re.MULTILINE)
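To illustrate how the anchored pattern treats indented blocks, here is a small sketch. The nested input is hypothetical (it is not from the question) and assumes inner def/end lines are indented by at least one space.

```python
import re

# Hypothetical input: an outer block that contains an indented inner def/end pair.
doc = "def\n outer\n def\n  inner\n end\nend\ndef\n other\nend\n"

# '^end' only matches 'end' at the start of a line, so the indented ' end'
# does not terminate the outer block.
blocks = re.findall(r"^def(.*?)^end", doc, re.DOTALL | re.MULTILINE)

for block in blocks:
    print(repr(block))
```

The first captured block keeps the whole inner def/end pair as part of its body, while the second contains only the "other" line.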
r = re.findall('def(.*?)end', doc, re.S)
You need to enable the re.MULTILINE flag so that ^ and $ match at the start and end of each line rather than only at the start and end of the whole string. Also, ^ and $ do NOT match linefeeds (\n) themselves, so the newlines have to be written explicitly in the pattern:
>>> re.findall(r"^def$\n(.*)\n^end$", doc, re.MULTILINE)
[' test', ' hello', ' world']
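If you don't want to match the whitespace at the beginning of the blocks, add \W* before the capture group (\W matches any non-word character, so \s* is the narrower choice if only whitespace should be skipped):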
>>> re.findall(r"^def$\n\W*(.*)\n^end$", doc, re.MULTILINE)
['test', 'hello', 'world']
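Because \W matches any non-word character, it skips more than whitespace. The following sketch compares it with \s on a hypothetical input whose second body deliberately starts with a hyphen; the sample text and variable names are assumptions for illustration only.

```python
import re

# Hypothetical input; the second body starts with punctuation on purpose.
doc = "def\n test\nend\ndef\n -hello\nend\n"

# \W* skips every leading non-word character, including the '-'.
with_W = re.findall(r"^def$\n\W*(.*)\n^end$", doc, re.MULTILINE)
print(with_W)  # ['test', 'hello']

# \s* skips only whitespace, so the '-' stays in the captured text.
with_s = re.findall(r"^def$\n\s*(.*)\n^end$", doc, re.MULTILINE)
print(with_s)  # ['test', '-hello']
```

If the bodies may legitimately begin with punctuation, the \s* form is probably the safer of the two.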