
Python - search txt.file for ID, then return variable from line below

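In Python, I'm trying (very badly) to read a .txt file, find the last occurrence of a string that refers to a specific customer, and then read a few lines below it to get the current points balance.

A snapshot of the .txt file is: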

Customer ID:123
Total sale amount:2345.45

Points from sale:23
Points until next bonus: 77

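I can search for (and find) the specific customer ID, but I can't figure out how to search for only the last occurrence of this ID, or how to return the "Points until next bonus" value... I apologise if this is a basic question, but any help would be greatly appreciated!

My code so far...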

def reward_points():

    #current points total
    rewards = open('sales.txt', 'r')

    line = rewards.readlines()
    search = (str('Customer ID:') + str(Cust_ID))
    print(search) #Customer ID:123

    while line != ' ':
        if line.startswith(search):
            find('Points until next bonus:')
            current_point_total = line[50:52]
            cust_record = rewards.readlines()
            print(current_point_total)


    rewards.close()

reward_points()

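I think you'd be better off parsing the file into structured data rather than trying to seek around in the file, which isn't in a particularly convenient format.

Here's a suggested approach

Iterate over the file with readline

Split each line into a label and a field by matching on ':'

Put the labels and fields that represent a customer into a dictionary

Put the dictionary that represents a customer into another dictionary

You then have an in-memory database that you can dereference with dict lookups

e.g. customers['1234']['Points until next bonus']

Here's some simplified example code for this approach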

#!/usr/bin/env python
import re

# dictionary with all the customers in
customers = {}
# id of the customer block currently being read (None until the first one)
current_id = None

with open("sales.txt") as f:
    # one line at a time
    for line in f:
        # pattern match on 'key:value', splitting on the first ':'
        field_match = re.match(r'^([^:]*):(.*)$', line)

        if field_match:
            # store the fields in variables, trimming surrounding whitespace
            key, value = (part.strip() for part in field_match.groups())
            # Customer ID means a new record
            if key == "Customer ID":
                # set a key for the 'customers database'
                current_id = value
                # each customer maps to an ordered list of dicts, one per block;
                # create the list the first time we see this id
                customers.setdefault(current_id, []).append({})
            # ignore any fields that appear before the first Customer ID
            if current_id is None:
                continue
            # store the key and value in the dictionary at the end of the list
            customers[current_id][-1][key] = value

# now customers is a "database" indexed on customer id
#  where the values are a list of dicts of each data block
#
# -1 indexes the last of the list
# so the last record for customer "123" is

print(customers["123"][-1]["Points until next bonus"])

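Update

I hadn't realised that you had multiple blocks per customer and were interested in their order, so I've reworked the example code to keep an ordered list of the data blocks, keyed on customer ID.

This is a good use case for itertools.groupby(), which fits this pattern well:

Example: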

from itertools import groupby


def search(d):
    """Key function used to group our dataset"""

    return d[0] == "Customer ID"


def read_customer_records(filename):
    """Read customer records and return a nicer data structure"""

    data = {}

    with open(filename, "r") as f:
        # clean and remove blank lines
        # (Python 3: the built-in filter and map are already lazy,
        # so itertools.ifilter and itertools.imap are not needed)
        lines = filter(None, map(str.strip, f))

        # split each line on the first ':' token
        lines = (line.split(":", 1) for line in lines)

        # iterate through each customer and their records
        for newcustomer, records in groupby(lines, search):
            if newcustomer:
                # we've found a new customer
                # create a new dict against their customer id
                customer_id = list(records)[0][1]
                data[customer_id] = {}
            else:
                # we've found customer records
                # add each key/value pair (split on ':')
                # to the customer record from above
                for k, v in records:
                    data[customer_id][k] = v

    return data

Output:

>>> read_customer_records("foo.txt")
{'123': {'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}, '124': {'Total sale amount': '245.45', 'Points until next bonus': ' 79', 'Points from sale': '27'}}

You can then look up a customer directly; for example:

>>> data = read_customer_records("foo.txt")
>>> data["123"]
{'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}
>>> data["123"]["Points until next bonus"]
' 77'

Basically, what we're doing here is "grouping" the dataset on the Customer ID: line. We then build a data structure (a dict) against which we can easily do O(1) lookups.
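To make the grouping step easier to see, here is a minimal sketch of what groupby() produces for this kind of input. The sample pairs and the name is_customer_header are made up for illustration; they are not part of the code above.

```python
from itertools import groupby

# Hypothetical sample lines, already split into [key, value] pairs
lines = [
    ["Customer ID", "123"],
    ["Total sale amount", "2345.45"],
    ["Points until next bonus", " 77"],
    ["Customer ID", "124"],
    ["Total sale amount", "245.45"],
    ["Points until next bonus", " 79"],
]


def is_customer_header(pair):
    """Key function: True for 'Customer ID' lines, False for everything else."""
    return pair[0] == "Customer ID"


# groupby yields alternating runs: a header run, then the run of fields after it
groups = [(is_header, list(run))
          for is_header, run in groupby(lines, is_customer_header)]

for is_header, run in groups:
    print(is_header, run)
```

Each True group holds a single Customer ID pair and each False group holds the fields that follow it, which is why the loop in read_customer_records can remember the id from one group and fill in the dict from the next.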

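Note: This will work no matter how many "records" a customer has, as long as the "customer records" in your "dataset" are separated by Customer ID. The implementation also tries to cope with "messy" data as far as it can by cleaning up the input a little.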

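I'd approach this more generally. If I'm not mistaken, you have a file of records in a specific format, where each record starts and ends with **. Why not do something like this?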

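with open("sales.txt") as f:
    records = f.read().split("**")
search = "Customer ID:123"
customer_dictionary = {}
for record in records:
    lines = record.strip().split("\n")
    if lines[0].strip() == search:
        customer_id = lines[0].split(":", 1)[1].strip()
        # later records overwrite earlier ones, so the latest is kept
        customer_dictionary[customer_id] = record.strip()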

This produces a mapping from customer_id to that customer's latest record. You can then parse the record to get the information you need.
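One possible way to do that parsing is sketched below. It assumes each record is a block of "key:value" lines like the snapshot in the question; parse_record and sample_record are hypothetical names used only for this illustration.

```python
def parse_record(record):
    """Turn one raw record string into a dict of field name -> value."""
    fields = {}
    for line in record.splitlines():
        # skip blank lines and anything without a 'key:value' shape
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip()
    return fields


# Hypothetical record text, shaped like the snapshot in the question
sample_record = (
    "Customer ID:123\n"
    "Total sale amount:2345.45\n"
    "\n"
    "Points from sale:23\n"
    "Points until next bonus: 77\n"
)

fields = parse_record(sample_record)
points = int(fields["Points until next bonus"])
print(points)
```

Stripping the value before calling int() means the space after the colon in "Points until next bonus: 77" does not matter.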

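Edit: Since each record has 9 lines in total, you can get a list of all the lines in the file and build a list of records from it, where each record is represented by a list of 9 lines. You can use the answer posted here:

Convert List to a list of tuples python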
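A rough sketch of that chunking idea follows. It assumes every record really is exactly 9 lines long, blank lines included, and that the Customer ID line comes first in each record. The helper names and the placeholder data are invented for the example.

```python
def chunk_lines(lines, record_size=9):
    """Split a flat list of lines into consecutive lists of record_size lines."""
    return [lines[i:i + record_size] for i in range(0, len(lines), record_size)]


def last_record_for(records, customer_id):
    """Return the last record whose first line names the customer, or None."""
    wanted = "Customer ID:" + str(customer_id)
    matches = [r for r in records if r and r[0].strip() == wanted]
    return matches[-1] if matches else None


def make_record(customer_id, points):
    """Build a hypothetical 9-line record padded with placeholder lines."""
    body = ["Customer ID:%s" % customer_id,
            "Points until next bonus: %s" % points]
    return body + ["placeholder line"] * 7


# Hypothetical data: customer 123 appears twice, the later record has 77 points
all_lines = make_record(123, 50) + make_record(124, 79) + make_record(123, 77)

records = chunk_lines(all_lines)
latest = last_record_for(records, 123)
print(latest[1])
```

With a real file, all_lines would come from something like f.read().splitlines(). If the records are not all the same length, the chunks will drift out of alignment, so this only suits a strictly fixed layout.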

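All you need to do is find the line that starts with Customer ID:123. When you find it, loop over the file object in an inner loop until you reach the Points until line, then extract the points. points will end up holding the value from the last occurrence of the customer with that ID.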

with open("test.txt") as f:
    points = ""
    for line in f:
        if line.rstrip() == "Customer ID:123":
            for line in f:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break

print(points)

Output:

77

def get_points_until_next_bonus(filename, customerID):
    # get the text that follows the last "Customer ID:<id>" marker
    with open(filename, 'r') as f:
        last_id = f.read().split('Customer ID:' + str(customerID))[-1]
    # take the value on the first "Points until next bonus: " line after it
    return last_id.split('Points until next bonus: ')[1].split('\n')[0]

