简体   繁体   English

使用正则表达式去除多行 python 文档字符串

[英]Strip multiline python docstrings with regex

I want to strip all python docstrings out of a file using simple search and replace, and the following (extremely) simplistic regex does the job for one line doc strings:我想使用简单的搜索和替换从文件中删除所有 python 文档字符串,以下(非常)简单的正则表达式为一行文档字符串完成这项工作:

Regex101.com Regex101.com

""".*"""

How can I extend that to work with multi-liners?我如何扩展它以使用多线?

Tried to include \\s in a number of places to no avail.试图在许多地方包含\\s无济于事。

As you cannot use an inline s (DOTALL) modifier, the usual workaround to match any char is using a character class with opposite shorthand character classes:由于您不能使用内联s (DOTALL) 修饰符,匹配任何字符的常用解决方法是使用具有相反速记字符类的字符类:

"""[\s\S]*?"""

or

"""[\d\D]*?"""

or

"""[\w\W]*?"""

will match """ then any 0+ chars, as few as possible as *? is a lazy quantfiier, and then trailing """ .将匹配"""然后任何 0+ 个字符,尽可能少*?是一个惰性量词,然后是尾随"""

Sometimes there are multiline strings that are not docstrings.有时有不是文档字符串的多行字符串。 For example, you may have a complicated SQL query that extends across multiple lines.例如,您可能有一个跨多行扩展的复杂 SQL 查询。 The following attempts to look for multiline strings that appear before class definitions and after function definitions.以下尝试查找出现在类定义之前和函数定义之后的多行字符串。

import re

input_str = """'''
This is a class level docstring
'''
class Article:
    def print_it(self):
        '''
        method level docstring
        '''
        print('Article')
        sql = '''
SELECT * FROM mytable
WHERE DATE(purchased) >= '2020-01-01'
'''
"""
    
doc_reg_1 = r'("""|\'\'\')([\s\S]*?)(\1\s*)(?=class)'
doc_reg_2 = r'(\s+def\s+.*:\s*)\n(\s*"""|\s*\'\'\')([\s\S]*?)(\2[^\n\S]*)'
input_str = re.sub(doc_reg_1, '', input_str)
input_str = re.sub(doc_reg_2, r'\1', input_str)
print(input_str)

Prints:打印:

class Article:
    def print_it(self):
        print('Article')
        sql = '''
SELECT * FROM mytable
WHERE DATE(purchased) >= '2020-01-01'
'''

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM