Python: get most recent date per company
I have a list of tuples, each holding a company name and a date. A company can appear with several dates:
[('Company A', datetime.date(1980,1,30)),
 ('Company A', datetime.date(1990,1,30)),
 ('Company B', datetime.date(1990,1,30)),
 ('Company B', datetime.date(2000,1,30))]
What I want is a list that contains only the most recent date for each company, i.e. this result:
[('Company A', datetime.date(1990,1,30)),
 ('Company B', datetime.date(2000,1,30))]
Any ideas?
How about using groupby from itertools and then taking the max of each group:
import datetime
import itertools

x = [('Company A', datetime.date(1980,1,30)),
     ('Company A', datetime.date(1990,1,30)),
     ('Company B', datetime.date(1990,1,30)),
     ('Company B', datetime.date(2000,1,30))]

out = []
# groupby only groups adjacent items, so sort by company name first
for k, g in itertools.groupby(sorted(x, key=lambda y: y[0]), lambda y: y[0]):
    # keep the tuple with the latest date in each group
    out.append(max(g, key=lambda y: y[1]))

print(out)
# [('Company A', datetime.date(1990, 1, 30)), ('Company B', datetime.date(2000, 1, 30))]
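One detail worth spelling out for the groupby approach: itertools.groupby only merges consecutive items with equal keys, so unsorted input has to be sorted by company first. Here is a minimal sketch of the same idea wrapped in a function. The name latest_per_company and the shuffled sample data are my own assumptions for illustration, not part of the question.

```python
import datetime
from itertools import groupby
from operator import itemgetter


def latest_per_company(records):
    """Return one (company, date) tuple per company, keeping the latest date.

    latest_per_company is a hypothetical helper name used for illustration.
    """
    by_company = itemgetter(0)
    by_date = itemgetter(1)
    # groupby only merges adjacent items, so sort by company first.
    ordered = sorted(records, key=by_company)
    return [max(group, key=by_date)
            for _, group in groupby(ordered, key=by_company)]


# Deliberately unsorted input to show that the sort step matters.
records = [
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
]
result = latest_per_company(records)
print(result)
```

The result comes back ordered by company name, because that is the order the sorted groups are visited in.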
You could also use a dictionary...
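data = [('Company A', '1980,1,30'),
        ('Company A', '1990,1,30'),
        ('Company B', '1990,1,30'),
        ('Company B', '2000,1,30')]

# Seed the dict with one date per company (the last one seen wins here)
datadict = {a: b for a, b in data}
# Caveat: the dates are strings, so max() compares them lexicographically.
# That is only reliable when the dates differ in the four-digit year;
# use datetime.date objects for a general solution.
for a, b in data:
    datadict[a] = max(b, datadict[a])
print(datadict)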
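The dictionary idea also works with real datetime.date objects as in the question, which avoids comparing date strings. A minimal single-pass sketch, assuming the input is a list of (company, date) tuples; the variable names records and latest are my own:

```python
import datetime

records = [
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
]

latest = {}
for company, date in records:
    # dict.get returns None for a company we have not seen yet.
    current = latest.get(company)
    if current is None or date > current:
        latest[company] = date

# Convert back to a list of tuples, ordered by company name.
result = sorted(latest.items())
print(result)
```

This needs no sorting of the input and visits each tuple once, so it should scale linearly with the length of the list.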
Here is an example using reduce():
import datetime
from functools import reduce  # reduce is not a builtin in Python 3

company_dates = [
    ('Company A', datetime.date(1980,1,30)),
    ('Company A', datetime.date(1990,1,30)),
    ('Company B', datetime.date(1990,1,30)),
    ('Company B', datetime.date(2000,1,30)),
]

def reducer(acc, company_date):
    # acc maps company name -> latest date seen so far
    try:
        acc[company_date[0]] = max(acc[company_date[0]], company_date[1])
    except KeyError:
        # first time this company appears
        acc[company_date[0]] = company_date[1]
    return acc

# renamed from "sorted" to avoid shadowing the builtin
latest = reduce(reducer, company_dates, {})
print(list(latest.items()))
Here is another alternative solution that uses different functions:
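import datetime
import operator

company_dates = [
    ('Company A', datetime.date(1980,1,30)),
    ('Company A', datetime.date(1990,1,30)),
    ('Company B', datetime.date(1990,1,30)),
    ('Company B', datetime.date(2000,1,30)),
]

# Sort descending by (company, date) so the first match per company is its latest date.
# The result is named "ordered" to avoid shadowing the builtin sorted().
ordered = sorted(company_dates, key=operator.itemgetter(0, 1), reverse=True)
unique = set(company_date[0] for company_date in ordered)
# The order of "top" follows set iteration order, which is not guaranteed.
top = [next(c for c in ordered if c[0] == company) for company in unique]
print(top)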
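A shorter variant of the sort-based idea, offered as a sketch rather than as what the answer above intended: if every element is exactly a (company, date) pair, sorting ascending and passing the result to dict() keeps the last value seen for each key, which is the latest date. This relies on dicts preserving insertion order (Python 3.7+); the names records and latest are my own.

```python
import datetime

records = [
    ('Company B', datetime.date(2000, 1, 30)),
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
]

# Tuples sort by company, then by date. dict() keeps the last value
# for each repeated key, so each company ends up with its latest date.
latest = dict(sorted(records))

result = list(latest.items())
print(result)
```

The trade-off is that it only works for two-element tuples, since dict() needs key/value pairs; with extra fields per record the groupby or plain-dictionary versions are a better fit.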