Python regular expression pattern matching problem

Question

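I'm working on a group project for homework and am trying to use my partner's Python script to collect input from a text file. I haven't used Python in a while and have never worked with imports.

Essentially, I'm loading data from a text file that has a repeating pattern. The pattern is as follows.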

SAMPLE INPUT 1:
3 1 1
1
3
SAMPLE OUTPUT 1:
3

SAMPLE INPUT 2:
4 2 2
2 3
1 4
SAMPLE OUTPUT 2:
4

SAMPLE INPUT 3:
8 2 2
1 8
5 4
SAMPLE OUTPUT 3:
5

The script tries to collect the data using re.findall(pattern, string, flags).

with open("dp.txt","r") as f:
        file = f.read() #bad if taking in big file
        #the code stops executing here..I assume because no pattern is matched
        for m in re.findall("SAMPLE INPUT (\d):\n(\d+) (\d+) (\d+)\n(.*)\n(.*)\nSAMPLE OUTPUT (\d):\n(\d+)\n",file):

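I'm reluctant to ask for a solution on a silver platter, but at this point the only thing keeping me from implementing my algorithm is silly pattern matching. I'm hoping a fresh pair of (Python-experienced) eyes can tell me why re.findall() doesn't match the .txt file.

Thanks for any suggestions. As a C programmer, I find the documentation on these Python imports lacking, but that may just be a personal problem :)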

Answer 1

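I think your understanding of regex and re.findall() is basically correct. The problem is all in file = f.read(): here the name file shadows a built-in class (in Python 2).

On the other hand, the official documentation is definitely the best guide, and you can type help(file) and help(re.findall) in the REPL.

One more thing,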

import re

with open("dp.txt", "r") as f:
    s = f.read()
# Run the loop outside the with-block: leaving the block closes the file.
for m in re.findall(r"SAMPLE INPUT (\d):\n(\d+) (\d+) (\d+)\n(.*)\n(.*)\nSAMPLE OUTPUT (\d):\n(\d+)\n", s):
    print(m)  # placeholder body: each m is a tuple of the captured groups
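
One way to check the pattern independently of the file is to run it against the sample text held in a string. The sketch below is only a diagnostic aid, not part of the original script: the names PATTERN and SAMPLE are made up for illustration, and the sample string assumes the file uses plain "\n" line endings and ends with a newline. It also shows two cases where the same pattern finds fewer matches: a missing final newline and "\r\n" line endings. Whether the script actually sees such line endings depends on the Python version and on how the file is opened.

```python
import re

# The pattern from the question, written as a raw string.
PATTERN = (
    r"SAMPLE INPUT (\d):\n(\d+) (\d+) (\d+)\n(.*)\n(.*)\n"
    r"SAMPLE OUTPUT (\d):\n(\d+)\n"
)

# The sample data from the question, held in a string instead of dp.txt
# (assumption: "\n" line endings and a trailing newline).
SAMPLE = (
    "SAMPLE INPUT 1:\n3 1 1\n1\n3\nSAMPLE OUTPUT 1:\n3\n\n"
    "SAMPLE INPUT 2:\n4 2 2\n2 3\n1 4\nSAMPLE OUTPUT 2:\n4\n\n"
    "SAMPLE INPUT 3:\n8 2 2\n1 8\n5 4\nSAMPLE OUTPUT 3:\n5\n"
)

# findall returns one tuple of captured groups per match.
matches = re.findall(PATTERN, SAMPLE)
for m in matches:
    print(m)

# The pattern ends with "\n", so the last block is lost if the text
# does not end with a newline.
no_trailing_newline = re.findall(PATTERN, SAMPLE.rstrip("\n"))

# With "\r\n" line endings the literal "\n" after ":" no longer lines up,
# so nothing matches at all.
with_crlf = re.findall(PATTERN, SAMPLE.replace("\n", "\r\n"))

print(len(matches), len(no_trailing_newline), len(with_crlf))
```

If the pattern matches the in-memory string but not the file contents, comparing repr(s) of the text read from the file with the string above may reveal stray characters.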

Answer 2

# load the regex module
import re

# make a method bound to a precompiled regex (can use it like a function)
get_samples = re.compile(
    r"SAMPLE INPUT (\d+):\n(.*?)\nSAMPLE OUTPUT \1:\n(.*?)(?:\n\n|$)",
    re.S     # make . match \n
).findall

# This regex looks for text like:
#
# SAMPLE INPUT {a number}:
# {data}
# SAMPLE OUTPUT {the same number}:
# {data}
# {a blank line OR end of file}
#

# load the data file
with open("myfile.txt") as inf:
    txt = inf.read()

# grab the data
data = get_samples(txt)

which, on the given data, results in

# data
[
    ('1', '3 1 1\n1\n3', '3'),
    ('2', '4 2 2\n2 3\n1 4', '4'),
    ('3', '8 2 2\n1 8\n5 4', '5')
]

The data elements need further parsing, for example

def ints(line):
    return [int(i) for i in line.split()]

def int_block(block):
    return [ints(line) for line in block.split("\n")]

data = [(int(a), int_block(b), ints(c)) for a,b,c in data]

which gives

# data
[
    (1,   [[3, 1, 1], [1],    [3]   ],   [3]),
    (2,   [[4, 2, 2], [2, 3], [1, 4]],   [4]),
    (3,   [[8, 2, 2], [1, 8], [5, 4]],   [5])
]
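
For trying this out without a data file, the steps in this answer can be folded into one function and run on the sample text held in a string. This is a minimal sketch under assumptions: parse_samples, SAMPLE_RE and text are hypothetical names, and the sample string here deliberately has no trailing newline, a case the `$` alternative in the regex is meant to cover.

```python
import re

# Same regex as above; re.S lets "." match newlines.
SAMPLE_RE = re.compile(
    r"SAMPLE INPUT (\d+):\n(.*?)\nSAMPLE OUTPUT \1:\n(.*?)(?:\n\n|$)",
    re.S,
)


def ints(line):
    """Convert a whitespace-separated line of digits into a list of ints."""
    return [int(i) for i in line.split()]


def parse_samples(text):
    """Hypothetical helper: return (number, input rows, output row) per sample."""
    return [
        (int(num), [ints(line) for line in block.split("\n")], ints(out))
        for num, block, out in SAMPLE_RE.findall(text)
    ]


# Sample data from the question, without a trailing newline.
text = (
    "SAMPLE INPUT 1:\n3 1 1\n1\n3\nSAMPLE OUTPUT 1:\n3\n\n"
    "SAMPLE INPUT 2:\n4 2 2\n2 3\n1 4\nSAMPLE OUTPUT 2:\n4\n\n"
    "SAMPLE INPUT 3:\n8 2 2\n1 8\n5 4\nSAMPLE OUTPUT 3:\n5"
)

for sample in parse_samples(text):
    print(sample)
```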
