Search for multiple regexes in multiple files and then output each match and its respective file
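I am trying to format the output as a table. For example, every file that has matches should be a column, and the individual matches should be the rows.
Here is my code: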
import glob
import re

folder_path = "/home/e136320"
file_pattern = "/*.txt"
match_list = []
folder_contents = glob.glob(folder_path + file_pattern)

# Search for Emails
regex1 = re.compile(r'\S+@\S+')
# Search for Phone Numbers
regex2 = re.compile(r'\d\d\d[-]\d\d\d[-]\d\d\d\d')
# Search for Physician's Name
regex3 = re.compile(r'\b\w\w\.\w+\b')

for file in folder_contents:
    read_file = open(file, 'rt').read()
    words = read_file.split()
    for line in words:
        email = regex1.findall(line)
        phone = regex2.findall(line)
        for word in email:
            print(file, email)
        for word in phone:
            print(file, phone)
Here is my output:
('/home/e136320/sample.txt', ['bcbs@aol.com'])
('/home/e136320/sample.txt', ['James@aol.com'])
('/home/e136320/sample.txt', ['248-981-3420'])
('/home/e136320/wow.txt', ['soccerfif@yahoo.com'])
('/home/e136320/wow.txt', ['313-806-6666'])
('/home/e136320/wow.txt', ['444-444-4444'])
('/home/e136320/wow.txt', ['248-805-6233'])
('/home/e136320/wow.txt', ['maliva@gmail.com'])
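Any ideas?
I would append the items you find to a list, so that the results are organized and kept between loop iterations. Then you can print them out afterwards. You could try something like this: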
import glob
import re

folder_path = "/home/e136320"
file_pattern = "/*.txt"
match_list = []
folder_contents = glob.glob(folder_path + file_pattern)

# Search for Emails
regex1 = re.compile(r'\S+@\S+')
# Search for Phone Numbers
regex2 = re.compile(r'\d\d\d[-]\d\d\d[-]\d\d\d\d')
# Search for Physician's Name (defined as in the question, not used below)
regex3 = re.compile(r'\b\w\w\.\w+\b')

results = {}
for file in folder_contents:
    # Read the whole file and close it promptly
    with open(file, 'rt') as handle:
        read_file = handle.read()
    words = read_file.split()
    current_results = []
    for line in words:
        email = regex1.findall(line)
        phone = regex2.findall(line)
        for word in email:
            # Append email regex matches to a list
            current_results.append(word)
        for word in phone:
            # Append phone regex matches to a list
            current_results.append(word)
    # Save results per file in a dictionary.
    # The file name is the key.
    results[file] = current_results

for key in results.keys():
    print(key, [str(item) for item in results[key]])
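Once the matches are collected per file, one way to get the table you asked for (files as columns, matches as rows) is to zip the per-file lists together. This is only a sketch: format_table is a made-up helper name, the sample dictionary just reuses the values from your output with shortened file names, and it assumes Python 3 and plain-text column alignment rather than any particular table library.

```python
from itertools import zip_longest


def format_table(results):
    """Return a plain-text table: one column per file, one match per row.

    `results` maps a file name to its list of matches (hypothetical helper).
    """
    files = list(results)
    if not files:
        return ""
    # Each column is as wide as its longest cell, header included.
    widths = [max([len(name)] + [len(m) for m in results[name]])
              for name in files]
    header = "  ".join(name.ljust(w) for name, w in zip(files, widths))
    lines = [header.rstrip()]
    # zip_longest pads the shorter columns with empty strings.
    columns = [results[name] for name in files]
    for row in zip_longest(*columns, fillvalue=""):
        line = "  ".join(cell.ljust(w) for cell, w in zip(row, widths))
        lines.append(line.rstrip())
    return "\n".join(lines)


# Sample data copied from the output in the question (base names only).
sample = {
    "sample.txt": ["bcbs@aol.com", "James@aol.com", "248-981-3420"],
    "wow.txt": ["soccerfif@yahoo.com", "313-806-6666", "444-444-4444",
                "248-805-6233", "maliva@gmail.com"],
}
print(format_table(sample))
```

You would call format_table(results) on the dictionary built above. Full paths make the columns wide, so you may want to use os.path.basename(file) as the key instead.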