Matching patterns against file contents and counting occurrences

Question

I am trying to read the contents of a file and use regular expressions to check whether it matches any pattern in a list.

File contents:

google.com
https://google.com
yahoo.com
www.yahoo.com
yahoo

My code:

import re
file = 'data_files/test_files/content.txt'

regex_1 = re.compile("google")
regex_2 = re.compile("yahoo")

data = open(file, 'r')

print ("Checking Regex 1")
if regex_1.match(data.read()):
    count_c = len(regex_1.findall(data.read()))
    print ("Matched Regex 1 - " + str(count_c))
print("Checking Regex 2")

if regex_2.match(data.read()):
    count_d = len(regex_2.findall(data.read()))
    print("Matched Regex 2 -  " + str(count_d))
else:
    print ("No match found")

Output:

Checking Regex 1
Checking Regex 2
No match found

I can't figure out what the problem is.

Answer 1

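Each call to data.read() starts reading from where the previous call left off. Because the first call reads the entire file (you did not pass a size limit), every later call starts at the end of the file and returns an empty string.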
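
The effect is easy to reproduce in isolation. The following is a minimal sketch that uses io.StringIO as a stand-in for the real file (an assumption made only so the example runs without content.txt); a file object returned by open() tracks its read position in the same way.

```python
import io

# io.StringIO stands in for a file opened with open(); both keep a read position.
data = io.StringIO("google.com\nyahoo.com\n")

first = data.read()   # reads everything and leaves the position at the end
second = data.read()  # nothing is left to read, so this is an empty string

print(repr(first))
print(repr(second))

data.seek(0)          # rewinding to the start makes the content readable again
third = data.read()
print(third == first)
```

Rewinding with seek(0) works, but reading once into a variable is usually simpler.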

You should read the file into a variable once, then use that variable instead of calling data.read() repeatedly.

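You also need to use re.search() rather than re.match(): match() only succeeds if the pattern matches at the very beginning of the string, whereas search() scans the whole string. See "What is the difference between re.search and re.match?"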
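
As a small illustration of that difference, here is a sketch that uses the file contents from the question as an inline string (the variable names at_start and anywhere are made up for this example):

```python
import re

# The sample file contents from the question, inlined as a string.
contents = "google.com\nhttps://google.com\nyahoo.com\nwww.yahoo.com\nyahoo\n"
regex_2 = re.compile("yahoo")

# match() only tries the pattern at position 0, where the text is "google...".
at_start = regex_2.match(contents)

# search() scans forward and stops at the first place the pattern matches.
anywhere = regex_2.search(contents)

print(at_start)
print(anywhere.group(), anywhere.start())
```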

import re
file = 'data_files/test_files/content.txt'

regex_1 = re.compile("google")
regex_2 = re.compile("yahoo")

with open(file, 'r') as data:
    contents = data.read()  # read the whole file once and reuse the string

print ("Checking Regex 1")
if regex_1.search(contents):
    count_c = len(regex_1.findall(contents))
    print ("Matched Regex 1 - " + str(count_c))

print("Checking Regex 2")
if regex_2.search(contents):
    count_d = len(regex_2.findall(contents))
    print("Matched Regex 2 -  " + str(count_d))
else:
    print ("No match found")
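
Since the question mentions a list of patterns, the same idea can be written as a loop instead of one block per regex. This is only a sketch: count_matches is a hypothetical helper, "bing" is an extra pattern added to show the zero-match case, and the file contents are inlined rather than read from disk.

```python
import re

def count_matches(text, patterns):
    """Return a dict mapping each pattern to its number of non-overlapping matches."""
    return {p: len(re.findall(p, text)) for p in patterns}

# The sample file contents from the question, inlined as a string.
contents = "google.com\nhttps://google.com\nyahoo.com\nwww.yahoo.com\nyahoo\n"

counts = count_matches(contents, ["google", "yahoo", "bing"])
for pattern, n in counts.items():
    if n:
        print("Matched " + pattern + " - " + str(n))
    else:
        print("No match found for " + pattern)
```

Because findall() returns an empty list when nothing matches, the separate search() check is not needed here.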

Solution 1: accepted, 2019-04-06 15:47:33
