Parsing a Python file with re

Question

I have a Python file, test.py:

import os
class test():

    def __init__(self):
        pass

    def add(num1, num2):
        return num1+num2

I am reading the file into a string like this:

with open('test.py', 'r') as myfile:
    data=myfile.read()

print(data)

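Now data holds the entire file as a single string, newlines included. I need to find the lines that start with class and the lines that start with def.

For example, I need the output to be: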

class test():
def __init__(self):
def add(num1, num2):

How can I do this with a regular expression?

Answer 1

If all you need is to find the def and class lines, it is much easier to avoid regular expressions.

Here you read the whole content of the file:

with open('test.py', 'r') as myfile:
    data=myfile.read()

print(data)

Why not look for those lines right there, while reading the file?

with open('test.py', 'r') as myfile:
    for line in myfile:
        stripped = line.strip()  # remove leading and trailing whitespace, including the newline
        # the trailing space keeps names such as "default" or "classes" from matching
        if stripped.startswith(('def ', 'class ')):
            print(stripped)

To work on the whole string, as asked:

import re
with open('test.py', 'r') as myfile:
    data = myfile.read()

print(data)

print(re.findall("class.+\n|def.+\n",data))

As the comments point out, this also matches text such as "defined as bla bla", so it is better to use

print(re.findall("class .+\n|def .+\n",data))
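
As a rough illustration of the difference between the two patterns, the sketch below runs both on a made-up string (the `sample` text is an assumption, not the asker's file). It also shows two limits that remain with the second pattern: it is not anchored, so `def ` inside a comment still matches, and a final line without a trailing newline is missed.

```python
import re

# Made-up sample input for illustration only; the last line has no trailing newline.
sample = "default = 1\n# def in a comment\nclass A:\n    def f(self): pass"

loose = re.findall("class.+\n|def.+\n", sample)     # first pattern from this answer
strict = re.findall("class .+\n|def .+\n", sample)  # second pattern, with the space

print(loose)   # ['default = 1\n', 'def in a comment\n', 'class A:\n']
print(strict)  # ['def in a comment\n', 'class A:\n']
```

Anchoring the match to the start of a line and not requiring the trailing `\n` would address both remaining issues.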

Answer 2

If you want to stick with the regex approach, use

re.findall(r'(?m)^[ \t]*((?:class|def)[ \t].*)', data)

or

re.findall(r'^[ \t]*((?:class|def)[ \t].*)', data, flags=re.M)

See the regex demo.

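The key point is to use ^ as the start-of-line anchor (which is why the pattern needs (?m) at the start or the re.M flag), then match any horizontal whitespace (with [ \t]*), then either class or def (with (?:class|def)), then one space or tab, and finally zero or more characters other than a newline (.*).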

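If you also want to handle Unicode whitespace, replace [ \t] with [^\S\r\n\f\v] (and use the re.UNICODE flag; in Python 3, str patterns are already Unicode-aware, so the flag is only needed in Python 2).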
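
A minimal sketch of that Unicode variant, assuming Python 3. The names `HSPACE`, `ascii_pat` and `unicode_pat` and the sample string are made up for illustration; the input is indented with no-break spaces (U+00A0), which would not be valid Python source but can show up in text copied from web pages.

```python
import re

# Horizontal whitespace: any whitespace character except \r, \n, \f and \v.
HSPACE = r'[^\S\r\n\f\v]'

ascii_pat = re.compile(r'^[ \t]*((?:class|def)[ \t].*)', re.MULTILINE)
unicode_pat = re.compile(rf'^{HSPACE}*((?:class|def){HSPACE}.*)', re.MULTILINE)

# Made-up input: the def line is indented with no-break spaces (U+00A0).
s = "class A:\n\u00a0\u00a0def f(self):\n\u00a0\u00a0\u00a0\u00a0pass"

print(ascii_pat.findall(s))    # ['class A:']
print(unicode_pat.findall(s))  # ['class A:', 'def f(self):']
```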

Python demo:

import re
p = re.compile(r'^[ \t]*((?:class|def)[ \t].*)', re.MULTILINE)
s = "test.py \n\nimport os\nclass test():\n\n    def __init__(self):\n        pass\n\n    def add(num1, num2):\n        return num1+num2"
print(p.findall(s))
# => ['class test():', 'def __init__(self):', 'def add(num1, num2):']
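
If the line numbers are also useful, the same pattern can be applied line by line. This is only a sketch: `find_definitions` and `LINE_PAT` are hypothetical names, and the `^` anchor is dropped because `match()` already starts at the beginning of each line.

```python
import re

# Same pattern as in the demo, without ^ and MULTILINE: each line is matched on its own.
LINE_PAT = re.compile(r'[ \t]*((?:class|def)[ \t].*)')

def find_definitions(source):
    """Return (line_number, text) pairs for lines that start with class or def."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        m = LINE_PAT.match(line)  # match() only succeeds at the start of the line
        if m:
            found.append((lineno, m.group(1)))
    return found

s = "import os\nclass test():\n\n    def __init__(self):\n        pass\n\n    def add(num1, num2):\n        return num1+num2"
print(find_definitions(s))
# [(2, 'class test():'), (4, 'def __init__(self):'), (7, 'def add(num1, num2):')]
```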

Answer 3

import re

with open('test.py', 'r') as myfile:
    data = myfile.read().split('\n')
    for line in data:
        # ^\s* allows any indentation (including none) before the keyword
        if re.search(r"^\s*class ", line) or re.search(r"^\s*def ", line):
            print(line)


Answer 1: score 2, 2016-08-04 09:10:57

Answer 2: score 2, accepted, 2016-08-04 09:27:29

Answer 3: score 1, 2016-08-04 09:18:56
