Python - Search a .txt file for an ID, then return a variable from the lines below it

Question

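In Python, I am trying (very badly) to read a .txt file, find the last occurrence of a string that refers to a specific customer, and then read a few lines below it to get the current points balance.

A snapshot of the .txt file: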

Customer ID:123
Total sale amount:2345.45

Points from sale:23
Points until next bonus: 77

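I can search for (and find) a specific customer ID, but I can't figure out how to search for only the last occurrence of that ID, or how to return the "Points until next bonus" value. I apologize if this is a basic question, but any help would be greatly appreciated!

My code so far: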

def reward_points():

    #current points total
    rewards = open('sales.txt', 'r')

    line = rewards.readlines()
    search = (str('Customer ID:') + str(Cust_ID))
    print(search) #Customer ID:123

    while line != ' ':
        if line.startswith(search):
            find('Points until next bonus:')
            current_point_total = line[50:52]
            cust_record = rewards.readlines()
            print(current_point_total)


    rewards.close()

reward_points()

Answer 1

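I think you would be better off parsing the file into structured data rather than trying to seek around in it, since this is not a particularly convenient file format.

Here is a suggested approach:

Iterate over the file with readline

Split each line into a label and a field by matching on ':'

Put the labels and fields that represent a customer into a dictionary

Put the dictionaries that represent customers into another dictionary

You then have an in-memory database that you can dereference with dict lookups

e.g. customers['1234']['Points until next bonus']

Here is simplified sample code for this approach: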

#!/usr/bin/env python
import re

# dictionary holding every customer, keyed by customer ID
customers = dict()
# ID of the block currently being read (None until the first "Customer ID" line)
current_id = None

with open("sales.txt") as f:
    # one line at a time
    for line in f:
        # pattern match on 'key:value', splitting at the first colon
        field_match = re.match(r'^([^:]*):(.*)$', line)

        if field_match:
            # store the fields in variables, trimming surrounding whitespace
            key, value = (part.strip() for part in field_match.groups())
            # "Customer ID" means a new record
            if key == "Customer ID":
                # set a key for the 'customers database'
                current_id = value
                # if we have never seen this ID before, create its list of records
                if customers.get(current_id) is None:
                    customers[current_id] = []
                # each block for this customer is appended as its own dict
                customers[current_id].append(dict())
            # store the key and value in the dict at the end of the list
            if current_id is not None:
                customers[current_id][-1][key] = value

# now customers is a "database" indexed on customer ID,
# where each value is a list of dicts, one per data block
#
# -1 indexes the last item in the list,
# so the most recent record for customer "123" is

print(customers["123"][-1]["Points until next bonus"])

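Update

I hadn't realized that you have multiple blocks per customer and care about their order, so I reworked the sample code to keep an ordered list of data blocks for each customer ID.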

Answer 2

This is a good use case for itertools.groupby(), which fits this pattern very well:

Example:

from itertools import groupby


def search(d):
    """Key function used to group our dataset"""

    return d[0] == "Customer ID"


def read_customer_records(filename):
    """Read customer records and return a nicer data structure"""

    data = {}

    with open(filename, "r") as f:
        # clean and remove blank lines
        # (Python 3: the built-in filter/map are lazy and replace ifilter/imap)
        lines = filter(None, map(str.strip, f))

        # split each line on the ':' token
        lines = (line.split(":", 1) for line in lines)

        # iterate through each customer and their records
        for newcustomer, records in groupby(lines, search):
            if newcustomer:
                # we've found a new customer
                # create a new dict against their customer id
                customer_id = list(records)[0][1]
                data[customer_id] = {}
            else:
                # we've found customer records
                # add each key/value pair (split on ':')
                # to the customer record from above
                for k, v in records:
                    data[customer_id][k] = v

    return data

Output:

>>> read_customer_records("foo.txt")
{'123': {'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}, '124': {'Total sale amount': '245.45', 'Points until next bonus': ' 79', 'Points from sale': '27'}}

You can then look up a customer directly; for example:

>>> data = read_customer_records("foo.txt")
>>> data["123"]
{'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}
>>> data["123"]["Points until next bonus"]
' 77'

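Basically, what we are doing here is "grouping" the dataset based on the Customer ID: line. We then build a data structure (a dict) that gives us easy O(1) lookups.

Note: as long as the "customer records" in your "dataset" are separated by Customer ID, this will work no matter how many "records" a customer has. This implementation also tries to handle "messy" data as far as possible by cleaning up the input a little.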
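To see why a repeated customer ID ends up with its most recent block, here is a minimal sketch of the grouping step on in-memory data. The sample pairs and the names is_header and latest are made up for illustration and are not part of the answer's code.

```python
from itertools import groupby

# Hypothetical sample: two blocks for customer 123, already split into (key, value) pairs.
pairs = [
    ("Customer ID", "123"),
    ("Points from sale", "23"),
    ("Points until next bonus", " 77"),
    ("Customer ID", "123"),
    ("Points from sale", "10"),
    ("Points until next bonus", " 67"),
]


def is_header(pair):
    """True for the pair that starts a customer block."""
    return pair[0] == "Customer ID"


# groupby yields runs of consecutive items that share the same key value,
# so header runs (True) alternate with record runs (False).
groups = [(flag, list(items)) for flag, items in groupby(pairs, is_header)]

latest = {}
for flag, items in groups:
    if flag:
        customer_id = items[0][1]
        latest[customer_id] = {}  # a later block replaces an earlier one
    else:
        latest[customer_id].update((k, v.strip()) for k, v in items)
```

Because each header resets the dict for that ID, latest["123"] holds only the values from the second block.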

Answer 3

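I would approach this more generally. If I'm not mistaken, there is a file of records in a specific format, where each record starts and ends with **. Why not do it like this?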

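# file_content (the whole file as a string), search (the header line to match)
# and getCustomerIdFromRecord (a helper you write) are assumed to exist
customer_dictionary = {}
records = file_content.split("**")
for record in records:
    # compare the first line of the record with the header we are looking for
    if record.strip().split("\n")[0] == search:
        customer_id = getCustomerIdFromRecord(record)
        # later records overwrite earlier ones, so the latest one wins
        customer_dictionary[customer_id] = record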

This produces a mapping from customer_id to that customer's latest record. You can then parse the record to get the information you need.
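One possible way to parse such a record is sketched below. It assumes each line of the record has the key:value shape shown in the question's snapshot; parse_record is a hypothetical helper, not something defined in this answer.

```python
def parse_record(record):
    """Turn one block of 'key:value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in record.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and lines without a colon
            fields[key.strip()] = value.strip()
    return fields


# Sample record text copied from the question's snapshot.
record = (
    "Customer ID:123\n"
    "Total sale amount:2345.45\n"
    "\n"
    "Points from sale:23\n"
    "Points until next bonus: 77\n"
)
fields = parse_record(record)
points = int(fields["Points until next bonus"])
```

str.partition splits at the first colon only, so a value that itself contains a colon stays intact.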

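Edit: since each record has 9 lines in total, you can get a list of all the lines in the file and build a list of records, where each record is represented by a list of 9 lines. You can use the answer posted here:

Convert a List to a list of tuples (Python)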
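As a rough sketch of that chunking idea (not the linked answer itself), a list of lines can be cut into fixed-size records with slicing. The helper name chunk and the sample list are assumptions for illustration; for the real file the size would be 9.

```python
def chunk(lines, size):
    """Split a list into consecutive sub-lists of `size` items (the last may be shorter)."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]


# Hypothetical stand-in for the lines of the file, grouped three at a time.
lines = ["a1", "a2", "a3", "b1", "b2", "b3", "c1"]
records = chunk(lines, 3)
```

This only works if every record really has the same number of lines, blank lines included.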

Answer 4

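All you need to do is find the line that starts with Customer ID:123; once you find it, loop over the file object in an inner loop until you reach the Points until line, then extract the points. points will end up holding the value from the last occurrence of the customer with that ID.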

with open("test.txt") as f:
    points = ""
    for line in f:
        if line.rstrip() == "Customer ID:123":
            for line in f:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break

print(points)
77
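The same nested-loop idea can be wrapped in a function that takes the customer ID as a parameter. This is a sketch under the assumption that the input is any iterable of lines (an open file works too); last_points and the sample list are hypothetical names and data.

```python
def last_points(lines, customer_id):
    """Return the 'Points until next bonus' value from the last block for customer_id."""
    target = "Customer ID:{}".format(customer_id)
    points = None
    it = iter(lines)
    for line in it:
        if line.rstrip() == target:
            # keep consuming the same iterator until this block's points line
            for line in it:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break
    return points


# Hypothetical sample with two blocks for customer 123.
sample = [
    "Customer ID:123\n", "Points until next bonus: 77\n",
    "Customer ID:124\n", "Points until next bonus: 79\n",
    "Customer ID:123\n", "Points until next bonus: 50\n",
]
result = last_points(sample, 123)
```

Both loops share one iterator, so the outer loop resumes right after the points line instead of re-reading the block.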

Answer 5

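def get_points_until_next_bonus(filename, customerID):
    # read the whole file and close it afterwards
    with open(filename, 'r') as f:
        content = f.read()
    # keep only the text after the last "Customer ID:<id>" line
    # (assumes the ID is followed directly by a newline, so 123 does not match 1234)
    last_id = content.split('Customer ID:' + str(customerID) + '\n')[-1]
    # take the first "Points until next bonus:" value in that text
    return last_id.split('Points until next bonus:')[1].split('\n')[0].strip()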

5 answers

Answer 1
2 2015-05-23 08:57:25

Answer 2
1 2015-05-23 09:49:12

Answer 3
0 2015-05-23 08:59:04

Answer 4
0 2015-05-23 09:45:36

Answer 5
0 2015-05-23 17:50:12
