Using regex to get passage between two strings in Python

I want to parse all of the functions inside a .txt file. It looks like this:

def
  test
end

def
  hello
end

def
  world
end

So, I would like to get the following returned: [test, hello, world]

Here is what I have tried, but I do not get anything back:

    import re

    r = re.findall('def(.*?)end', doc)
    print(r)

You have to use the re.DOTALL flag, which allows . to match newlines too (your doc spans multiple lines, and by default . stops at a newline).

You could additionally use '^def' and '^end' in the regex if you only wanted the outer def/end blocks (i.e. ignore indented ones), in which case you would also need the re.MULTILINE flag, which allows '^' and '$' to match at the start/end of each line (as opposed to the start/end of the string).

re.findall('^def(.*?)^end', doc, re.DOTALL | re.MULTILINE)

re.S is the short alias for re.DOTALL:

r = re.findall('def(.*?)end', doc, re.S)
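
As a rough sketch of how these variants behave, assuming doc holds the sample from the question with two-space indentation (the nested sample and the variable names below are made up for illustration):

```python
import re

# Sample input from the question, assumed to use two-space indentation.
doc = "def\n  test\nend\n\ndef\n  hello\nend\n\ndef\n  world\nend\n"

# Without DOTALL, "." stops at newlines, so nothing can span def ... end.
no_flag = re.findall(r"def(.*?)end", doc)

# With DOTALL (alias re.S), "." also matches newlines.
raw = re.findall(r"def(.*?)end", doc, re.DOTALL)

# Each capture still carries the surrounding newlines and indentation,
# so strip it to get the bare names.
names = [body.strip() for body in raw]

# Hypothetical nested input: an indented def/end inside an outer block.
nested = "def\n  outer\n  def\n    inner\n  end\nend\n"

# Unanchored: the lazy match stops at the first "end", which is the inner one.
unanchored = re.findall(r"def(.*?)end", nested, re.DOTALL)

# Anchored: "^end" only matches an "end" at the start of a line.
anchored = re.findall(r"^def(.*?)^end", nested, re.DOTALL | re.MULTILINE)

print(no_flag)
print(raw)
print(names)
print(unanchored)
print(anchored)
```

Note that the captured groups include the newline and indentation around each body, so a strip() pass (or a tighter pattern) is still needed to get exactly test, hello and world.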

You need to enable the re.MULTILINE flag so that ^ and $ match at the start and end of every line, not just at the start and end of the whole string. Also, ^ and $ are zero-width and do NOT consume linefeeds ( \n ), so the pattern has to match those explicitly.

>>> re.findall(r"^def$\n(.*)\n^end$", doc, re.MULTILINE)
['  test', '  hello', '  world']

If you don't want to match the whitespace at the beginning of the blocks, add \W* (it skips any leading non-word characters, which includes spaces):

>>> re.findall(r"^def$\n\W*(.*)\n^end$", doc, re.MULTILINE)
['test', 'hello', 'world']
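
One caveat with the pattern above is that \W matches punctuation as well as whitespace, so a body that begins with a symbol would lose it. A possible alternative, sketched here under the assumption that the indentation consists only of spaces and tabs, is to skip [ \t]* instead (the punct sample is hypothetical):

```python
import re

# Sample input from the question, assumed to use two-space indentation.
doc = "def\n  test\nend\n\ndef\n  hello\nend\n\ndef\n  world\nend\n"

# [ \t]* skips only leading spaces and tabs before the captured body.
pattern = re.compile(r"^def$\n[ \t]*(.*)\n^end$", re.MULTILINE)
names = pattern.findall(doc)

# Hypothetical body that starts with punctuation.
punct = "def\n  (x)\nend\n"

# \W* swallows the opening parenthesis along with the indentation.
with_nonword = re.findall(r"^def$\n\W*(.*)\n^end$", punct, re.MULTILINE)

# [ \t]* leaves the parenthesis in the captured body.
with_blank = pattern.findall(punct)

print(names)
print(with_nonword)
print(with_blank)
```

Both patterns only handle bodies that fit on a single line, because (.*) does not cross newlines without re.DOTALL.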
