Python: converting a list of tuples to dictionary with some conditions
I created a list like this:
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
(25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
(26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold'),...]
I want to convert it into a dictionary like this:
bookDict = { 24: {'Start': '2008-10-30', 'End': '2008-12-20','reason':'sold'},
25: {'Start': '2009-01-01', 'End': '2009-11-14','reason':'returned'},
26: {'Start': '2010-04-03', 'End': '2010-10-11','reason':'sold'},...}
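Each key in the dictionary is the first value of a tuple in the Book list (a code), and for each key I want two items as the value: one for that code's 'Start' point and one for its 'End' point.
I have one more problem. Some codes have several 'End' points with different dates. I want to keep only one End point, the one with the latest date. Something like this: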
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'),
(24, '2009-02-04', 'End', 'sold'), (24, '2009-11-25', 'End', 'sold')]
For the example above, the dictionary should keep the following:
bookDict = { 24: {'Start': '2008-10-30', 'End': '2009-11-25','reason':'sold'},
Can anyone help me?
You can use itertools.groupby, min, and max:
import itertools
def quantity_key(d):
    # Sort key: the date string split into [year, month, day] integers
    return list(map(int, d[1].split('-')))
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'), (25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'), (26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold')]
new_books = {a:list(b) for a, b in itertools.groupby(Book, key=lambda x:x[0])}
final_books = {a:{'Start':min(b, key=quantity_key)[1], 'End':max(b, key=quantity_key)[1], 'reason':max(b, key=quantity_key)[-1]} for a, b in new_books.items()}
Output:
{24: {'Start': '2008-10-30', 'End': '2008-12-20', 'reason': 'sold'}, 25: {'Start': '2009-01-01', 'End': '2009-11-14', 'reason': 'returned'}, 26: {'Start': '2010-04-03', 'End': '2010-10-11', 'reason': 'sold'}}
With more than two values per key:
Book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'), (24, '2009-02-04', 'End', 'sold'), (24, '2009-11-25', 'End', 'sold')]
new_books = {a:list(b) for a, b in itertools.groupby(Book, key=lambda x:x[0])}
final_books = {a:{'Start':min(b, key=quantity_key)[1], 'End':max(b, key=quantity_key)[1], 'reason':max(b, key=quantity_key)[-1]} for a, b in new_books.items()}
Output:
{24: {'Start': '2008-10-30', 'End': '2009-11-25', 'reason': 'sold'}}
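Note that itertools.groupby only merges adjacent items, so the approach above relies on Book already being ordered by code. The following is a minimal sketch of a variant that sorts first and selects the Start and End rows by their marker instead of by min and max over all rows. The function name build_book_dict and the shuffled sample data are illustrative assumptions, as is the choice to keep the earliest Start when a code has several.

```python
import itertools
from operator import itemgetter

def build_book_dict(records):
    """Group (code, date, marker[, reason]) tuples by code, keeping the latest 'End'."""
    result = {}
    # groupby only merges adjacent items, so sort by code first.
    ordered = sorted(records, key=itemgetter(0))
    for code, group in itertools.groupby(ordered, key=itemgetter(0)):
        rows = list(group)
        starts = [r for r in rows if r[2] == 'Start']
        ends = [r for r in rows if r[2] == 'End']
        entry = {}
        if starts:
            # Assumption: with several Start rows, keep the earliest date.
            entry['Start'] = min(starts, key=itemgetter(1))[1]
        if ends:
            # ISO-formatted dates sort correctly as plain strings.
            latest = max(ends, key=itemgetter(1))
            entry['End'] = latest[1]
            entry['reason'] = latest[3]
        result[code] = entry
    return result

# Hypothetical input in which the rows for each code are not adjacent.
shuffled = [(24, '2008-12-20', 'End', 'sold'), (25, '2009-01-01', 'Start'),
            (24, '2008-10-30', 'Start'), (25, '2009-11-14', 'End', 'returned'),
            (24, '2009-11-25', 'End', 'sold')]
print(build_book_dict(shuffled))
```

Sorting makes this O(n log n) rather than a single pass, which is the price of not depending on the input order.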
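Here is a solution that satisfies both conditions.
Whenever it encounters a new book ID it creates a dict for it, then fills that dict in as it walks through the list.
For multiple End entries, your date format (zero-padded YYYY-MM-DD) allows plain string comparison to find the latest date.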
books = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
(25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
(26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold'),
(26, '2011-10-11', 'End', 'returned')] # The latest 'End' entry should be picked
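bookDict = {}
for info in books:
    id_ = info[0]
    type_ = info[2]
    # Create the per-book dict the first time an ID is seen
    book = bookDict.setdefault(id_, {})
    if type_ == 'Start':
        book[type_] = info[1]
    # Only overwrite 'End' when this entry's date is later than the stored one
    elif type_ == 'End' and info[1] > book.get(type_, ''):
        book[type_] = info[1]
        book['reason'] = info[3]
Output: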
bookDict
# {24: {'Start': '2008-10-30', 'End': '2008-12-20', 'reason': 'sold'},
# 25: {'Start': '2009-01-01', 'End': '2009-11-14', 'reason': 'returned'},
# 26: {'Start': '2010-04-03', 'End': '2011-10-11', 'reason': 'returned'}}
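The string comparison used in this answer works only because the dates are zero-padded ISO strings. As a small illustrative check (the sample lists are made up, and date.fromisoformat needs Python 3.7 or later), the sketch below compares the string-based maximum with one computed from real date objects, and shows how an unpadded month breaks the ordering:

```python
from datetime import date

# Hypothetical End dates for a single book code.
end_dates = ['2009-02-04', '2008-12-20', '2009-11-25']

# Lexicographic maximum versus maximum by parsed calendar date.
latest_by_string = max(end_dates)
latest_by_date = max(end_dates, key=date.fromisoformat)
print(latest_by_string, latest_by_date)  # 2009-11-25 2009-11-25

# Without zero padding, '2' sorts after '1', so February looks later than November.
unpadded_is_later = '2009-2-04' > '2009-11-25'
print(unpadded_is_later)  # True
```

If the input might contain unpadded or otherwise inconsistent dates, parsing them before comparing is the safer choice.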
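You could do the following:
d = {}
for t in Book:
    # Unpack the fixed fields; rest holds the reason when present
    index, date, marker, *rest = t
    entry = d.setdefault(index, {})
    end_date = entry.get("End", "1900-01-01")
    # Always record a Start; record an End only if it is later than the stored one
    if marker == "Start" or date > end_date:
        entry[marker] = date
        if rest:
            entry["reason"] = rest[0]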
This answers only the first part of the OP's question, although it can be adapted for the second part.
You can use collections.defaultdict for an O(n) solution:
book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End','sold'),
(25, '2009-01-01', 'Start'), (25, '2009-11-14', 'End', 'returned'),
(26, '2010-04-03', 'Start'), (26, '2010-10-11', 'End', 'sold')]
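from collections import defaultdict
d = defaultdict(dict)
for key, date, *data in book:
    # data[0] is the marker ('Start' or 'End'); data[1], if present, is the reason
    d[key][data[0]] = date
    if len(data) == 2:
        d[key]['reason'] = data[1]
Alternatively, you can catch IndexError instead of testing the length of the tuple: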
for key, date, *data in book:
    d[key][data[0]] = date
    try:
        d[key]['reason'] = data[1]
    except IndexError:
        continue
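As written, the defaultdict loops overwrite 'End' with whichever End row comes last in the list. One possible way to adapt them to the second requirement (keep only the latest End date) is sketched below; the sample data and the guard condition are assumptions added for illustration, not part of the original answer.

```python
from collections import defaultdict

# Hypothetical data: the End rows for code 24 are deliberately out of order.
book = [(24, '2008-10-30', 'Start'), (24, '2008-12-20', 'End', 'sold'),
        (24, '2009-11-25', 'End', 'sold'), (24, '2009-02-04', 'End', 'returned')]

d = defaultdict(dict)
for key, date, marker, *reason in book:
    entry = d[key]
    # Skip an End row unless it is strictly later than the one already stored.
    if marker == 'End' and date <= entry.get('End', ''):
        continue
    entry[marker] = date
    if reason:
        entry['reason'] = reason[0]

print(dict(d))
# {24: {'Start': '2008-10-30', 'End': '2009-11-25', 'reason': 'sold'}}
```

This stays a single O(n) pass and, like the other string-comparison answers, relies on the dates being zero-padded YYYY-MM-DD strings.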