Python - search .txt file for ID, then return variable from line below
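In Python, I am trying (very badly) to read a .txt file, find the last occurrence of a string that refers to a particular customer, and then read a few lines below it to get their current points balance.
A snapshot of the .txt file is: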
Customer ID:123
Total sale amount:2345.45
Points from sale:23
Points until next bonus: 77
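I can search for (and find) a specific customer ID, but I can't figure out how to search for only the last occurrence of that ID, or how to return the "Points until next bonus" value. I apologise if this is a basic question, but any help would be greatly appreciated!
My code so far...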
def reward_points():
    #current points total
    rewards = open('sales.txt', 'r')
    line = rewards.readlines()
    search = (str('Customer ID:') + str(Cust_ID))
    print(search) #Customer ID:123
    while line != ' ':
        if line.startswith(search):
            find('Points until next bonus:')
            current_point_total = line[50:52]
            cust_record = rewards.readlines()
            print(current_point_total)
    rewards.close()

reward_points()
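I think you would be better off parsing the file into structured data rather than trying to seek around in it; this is not a particularly convenient file format.
Here is a suggested approach:
Iterate over the file line by line
Split each line into a label and a field by matching on ':'
Put the labels and fields that represent a customer into a dictionary
Put the dictionaries that represent customers into another dictionary
You then have an in-memory database that you can dereference with dict lookups
e.g. customers['1234']['Points until next bonus']
Here is simplified example code for this approach: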
#!/usr/bin/env python
import re

# dictionary with all the customers in
customers = dict()
with open("sales.txt") as f:
    # one line at a time
    for line in f:
        # pattern match on 'key:value'
        field_match = re.match(r'^(.*):(.*)$', line)
        if field_match:
            # store the fields in variables
            (key, value) = field_match.groups()
            # Customer ID means a new record
            if key == "Customer ID":
                # set a key for the 'customers database'
                current_id = value
                # if we have never seen this id before it's the first, make a record
                if customers.get(current_id) is None:
                    customers[current_id] = []
                # make the record an ordered list of dicts for each block
                customers[current_id].append(dict())
            # store the key and value (the Customer ID line included)
            # in the dictionary at the end of the list
            customers[current_id][-1][key] = value

# now customers is a "database" indexed on customer id
# where the values are a list of dicts of each data block
#
# -1 indexes the last of the list
# so the last record for customer "123" is
print(customers["123"][-1]["Points until next bonus"])
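Update
I hadn't realised that you have multiple blocks per customer and that the order matters, so I reworked the example code to keep an ordered list of the data blocks, keyed by customer ID.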
This is a great use case for itertools.groupby(), which fits this pattern very well.
Example:
from itertools import groupby

def search(d):
    """Key function used to group our dataset"""
    return d[0] == "Customer ID"

def read_customer_records(filename):
    """Read customer records and return a nicer data structure"""
    data = {}
    with open(filename, "r") as f:
        # clean and remove blank lines
        lines = filter(None, map(str.strip, f))
        # split each line on the ':' token
        lines = (line.split(":", 1) for line in lines)
        # iterate through each customer and their records
        for newcustomer, records in groupby(lines, search):
            if newcustomer:
                # we've found a new customer
                # create a new dict against their customer id
                customer_id = list(records)[0][1]
                data[customer_id] = {}
            else:
                # we've found customer records
                # add each key/value pair (split on ':')
                # to the customer record from above
                for k, v in records:
                    data[customer_id][k] = v
    return data
Output:
>>> read_customer_records("foo.txt")
{'123': {'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}, '124': {'Total sale amount': '245.45', 'Points until next bonus': ' 79', 'Points from sale': '27'}}
You can then look up customers directly; for example:
>>> data = read_customer_records("foo.txt")
>>> data["123"]
{'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}
>>> data["123"]["Points until next bonus"]
' 77'
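Basically what we are doing here is "grouping" the dataset on the Customer ID: line. We then build a data structure (a dict) that we can easily do O(1) lookups against.
Note: as long as the "customer records" in your "dataset" are delimited by a Customer ID line, this will work no matter how many "records" a customer has. This implementation also tries to cope with "messy" data as far as possible by cleaning the input a little.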
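To see why the grouping works, here is a minimal sketch of how groupby() splits a sequence of (label, value) pairs into alternating runs. It uses made-up in-memory data instead of a file, and the names pairs, is_customer_id and runs are illustrative only.

```python
from itertools import groupby

# Made-up sample data: (label, value) pairs as produced by line.split(":", 1)
pairs = [
    ("Customer ID", "123"),
    ("Total sale amount", "2345.45"),
    ("Points until next bonus", " 77"),
    ("Customer ID", "124"),
    ("Total sale amount", "245.45"),
    ("Points until next bonus", " 79"),
]

def is_customer_id(pair):
    """Key function: True for the header line of a block, False otherwise."""
    return pair[0] == "Customer ID"

# groupby() starts a new group every time the key function's result changes,
# so header lines and their detail lines come out as alternating runs.
runs = [(flag, list(group)) for flag, group in groupby(pairs, is_customer_id)]

for flag, group in runs:
    print(flag, group)
```

Each True run holds a header line and each False run holds the detail lines that follow it. One thing to keep in mind: two Customer ID lines with nothing between them would land in the same True run, and a reader that only takes the first item of that run would skip the second ID.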
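I would approach this more generally. If I'm not mistaken, you have a file of records in a specific format, where each record begins and ends with **. Why not do it like this?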
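# file_content: the whole file as one string; search: e.g. "Customer ID:123"
customer_dictionary = {}
for record in file_content.split("**"):
    first_line = record.strip().split("\n")[0]
    if first_line == search:
        # later records overwrite earlier ones, so the latest record wins
        customer_id = first_line.split(":", 1)[1]
        customer_dictionary[customer_id] = record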
This produces a mapping from customer_id to that customer's latest record. You can then parse the record to get the information you need.
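Parsing a stored record could look something like the sketch below. parse_record is a hypothetical helper, not part of the answer, and the record text is made up to follow the format shown in the question.

```python
def parse_record(record):
    """Turn one record's text into a dict of {label: value} (hypothetical helper)."""
    fields = {}
    for line in record.splitlines():
        # skip blank lines and anything without a "label:value" shape
        if ":" not in line:
            continue
        label, value = line.split(":", 1)
        fields[label.strip()] = value.strip()
    return fields

# Made-up record text, following the format shown in the question
record = (
    "Customer ID:123\n"
    "Total sale amount:2345.45\n"
    "Points from sale:23\n"
    "Points until next bonus: 77\n"
)
fields = parse_record(record)
print(fields["Points until next bonus"])
```

Stripping both the label and the value means stray spaces around the colon do not end up in the dictionary keys or values.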
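Edit: since each record has 9 lines in total, you can get a list of all the lines in the file and build a list of records, where each record is represented by a list of 9 lines. You can use the answer posted here: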
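The linked answer is not reproduced here, but one common way to do this kind of chunking is to slice the list in fixed-size steps. The sketch below assumes every record really has the same number of lines. chunk_lines and record_size are illustrative names, and the made-up sample uses 4-line records only to stay short.

```python
def chunk_lines(lines, record_size):
    """Split a flat list of lines into consecutive records of record_size lines."""
    return [lines[i:i + record_size] for i in range(0, len(lines), record_size)]

# Made-up data: two records of 4 lines each (the answer above assumes 9)
lines = [
    "Customer ID:123", "Total sale amount:2345.45",
    "Points from sale:23", "Points until next bonus: 77",
    "Customer ID:123", "Total sale amount:100.00",
    "Points from sale:1", "Points until next bonus: 76",
]
records = chunk_lines(lines, 4)

# Walk the records in file order; the last match is the most recent one
latest = None
for record in records:
    if record[0] == "Customer ID:123":
        latest = record
print(latest[-1])
```

If the file can contain blank lines or records of varying length, fixed-size slicing will drift out of alignment, so this only fits a strictly regular file.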
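All you need to do is find the line that starts with Customer ID:123. When you find it, loop over the file object in an inner loop until you reach the Points until line, then extract the points. points will end up holding the value from the last occurrence of the customer with that id.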
with open("test.txt") as f:
    points = ""
    for line in f:
        if line.rstrip() == "Customer ID:123":
            for line in f:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break
    print(points)
77
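The same nested-loop idea can be wrapped in a function so the customer ID is not hard-coded. This is only a sketch: last_points_for is a made-up name, it accepts any iterable of lines (an open file or a list), and the sample data is invented for illustration.

```python
def last_points_for(lines, customer_id):
    """Return the 'Points until next bonus' value from the customer's last block.

    `lines` may be an open file or any iterable of strings.
    Returns None if the customer never appears.
    """
    header = "Customer ID:{}".format(customer_id)
    points = None
    # one shared iterator, so the inner loop resumes where the outer one stopped
    it = iter(lines)
    for line in it:
        if line.rstrip() == header:
            for line in it:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break
    return points

# Made-up sample: customer 123 appears twice, customer 1234 once
sample = [
    "Customer ID:123\n", "Points until next bonus: 77\n",
    "Customer ID:1234\n", "Points until next bonus: 5\n",
    "Customer ID:123\n", "Points until next bonus: 60\n",
]
print(last_points_for(sample, 123))
```

With a real file this would be called as last_points_for(f, 123) inside a with open("sales.txt") as f: block. Comparing the whole stripped line against the header means customer 123 is not confused with customer 1234.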
def get_points_until_next_bonus(filename, customerID):
    # get the text that follows the last "Customer ID:<id>" in the file
    with open(filename, 'r') as f:
        last_id = f.read().split('Customer ID:' + str(customerID))[-1]
    # take the first "Points until next bonus: " value after it, e.g. 77
    return last_id.split('Points until next bonus: ')[1].split('\n')[0]
#there you go...