Stripping multi-line Python docstrings with a regular expression

Question

I want to strip all Python docstrings out of a file using a simple search and replace. The following (extremely) simplistic regex does the job for one-line docstrings:

Regex101.com

""".*"""

How can I extend that to work with multi-line docstrings?

I tried including \s in a number of places, to no avail.

Answer 1

As you cannot use an inline s (DOTALL) modifier, the usual workaround to match any character is a character class that combines a shorthand class with its negation:

"""[\s\S]*?"""

or

"""[\d\D]*?"""

or

"""[\w\W]*?"""

Each of these matches """, then zero or more characters of any kind (as few as possible, since *? is a lazy quantifier), and then the closing """.
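To see the effect in Python itself, here is a minimal sketch. The sample source text and the variable names are made up for illustration. It applies the [\s\S]*? pattern with re.sub, checks that Python's own re.DOTALL flag gives the same result with a plain dot, and shows what goes wrong when the quantifier is greedy.

```python
import re

# Hypothetical sample source with one single-line and one multi-line docstring.
source = '''def f():
    """One-line docstring."""
    return 1

def g():
    """
    Multi-line
    docstring.
    """
    return 2
'''

# [\s\S] matches any character, newlines included; *? stops at the nearest closing quotes.
stripped = re.sub(r'"""[\s\S]*?"""', '', source)

# In Python the DOTALL flag is available, so a plain dot gives the same result.
stripped_dotall = re.sub(r'""".*?"""', '', source, flags=re.DOTALL)

# A greedy quantifier runs from the first opening quotes to the last closing quotes,
# swallowing the code in between.
greedy = re.sub(r'"""[\s\S]*"""', '', source)

print(stripped)
```

Note that this removes every triple-double-quoted string, not only docstrings, and it leaves the now-empty indented lines behind.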

Answer 2

Sometimes there are multiline strings that are not docstrings. For example, you may have a complicated SQL query that extends across multiple lines. The following attempts to look for multiline strings that appear before class definitions and after function definitions.

import re

input_str = """'''
This is a class level docstring
'''
class Article:
    def print_it(self):
        '''
        method level docstring
        '''
        print('Article')
        sql = '''
SELECT * FROM mytable
WHERE DATE(purchased) >= '2020-01-01'
'''
"""
    
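# Pattern 1: a triple-quoted string followed (after optional whitespace) by a class definition.
doc_reg_1 = r'("""|\'\'\')([\s\S]*?)(\1\s*)(?=class)'
# Pattern 2: a triple-quoted string on the line right after a def line; group 1 keeps the def line.
doc_reg_2 = r'(\s+def\s+.*:\s*)\n(\s*"""|\s*\'\'\')([\s\S]*?)(\2[^\n\S]*)'
input_str = re.sub(doc_reg_1, '', input_str)
input_str = re.sub(doc_reg_2, r'\1', input_str)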
print(input_str)

Prints:

class Article:
    def print_it(self):
        print('Article')
        sql = '''
SELECT * FROM mytable
WHERE DATE(purchased) >= '2020-01-01'
'''
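As a rough usage sketch, the two patterns above can be wrapped in a helper and tried on input that uses double-quoted docstrings. The function name strip_docstrings, the constant names, and the sample text are hypothetical and not part of the original answer. Only the two regexes are copied from it.

```python
import re

# Patterns copied unchanged from the answer above.
DOC_BEFORE_CLASS = r'("""|\'\'\')([\s\S]*?)(\1\s*)(?=class)'
DOC_AFTER_DEF = r'(\s+def\s+.*:\s*)\n(\s*"""|\s*\'\'\')([\s\S]*?)(\2[^\n\S]*)'


def strip_docstrings(text):
    """Remove strings before a class and strings directly after a def line."""
    text = re.sub(DOC_BEFORE_CLASS, '', text)
    return re.sub(DOC_AFTER_DEF, r'\1', text)


# Hypothetical sample: two docstrings plus a multi-line string that must survive.
sample = (
    '"""\n'
    'Docstring placed before the class.\n'
    '"""\n'
    'class Report:\n'
    '    def run(self):\n'
    '        """\n'
    '        Method docstring.\n'
    '        """\n'
    '        query = """\n'
    'SELECT 1\n'
    '"""\n'
    '        return query\n'
)

result = strip_docstrings(sample)
print(result)
```

The multi-line query string is kept because it follows neither a def line nor precedes a class keyword. As far as I can tell, neither pattern targets a docstring placed directly under the class line, so that case would need its own pattern.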

Answer 1: score 4, accepted, 2017-06-13 21:49:02

Answer 2: score 0, 2020-12-19 14:41:49
