Python: get most recent date per company
I have a list of tuples, each containing a company name and a date. A company can appear with several dates:
[('Company A', datetime.date(1980,1,30)),
 ('Company A', datetime.date(1990,1,30)),
 ('Company B', datetime.date(1990,1,30)),
 ('Company B', datetime.date(2000,1,30))]
What I want is a list containing only the most recent date for each company, i.e. this result:
[('Company A', datetime.date(1990,1,30)),
 ('Company B', datetime.date(2000,1,30))]
Any ideas?
How about using groupby from itertools, and then taking the max:
import datetime
import itertools

x = [('Company A', datetime.date(1980,1,30)),
     ('Company A', datetime.date(1990,1,30)),
     ('Company B', datetime.date(1990,1,30)),
     ('Company B', datetime.date(2000,1,30))]

out = []
# groupby only groups adjacent items, so sort by company name first
for k, g in itertools.groupby(sorted(x, key=lambda y: y[0]), lambda y: y[0]):
    # keep the tuple with the latest date in each group
    out.append(max(g, key=lambda y: y[1]))

print(out)
# [('Company A', datetime.date(1990, 1, 30)),
#  ('Company B', datetime.date(2000, 1, 30))]
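The groupby approach depends on the sort step, because itertools.groupby only merges items that are next to each other. A minimal sketch of the same idea wrapped in a function and fed a deliberately shuffled list; the helper name latest_per_company and the variable names are assumptions for illustration, not part of the answer:

```python
import datetime
import itertools
import operator


def latest_per_company(records):
    """Return one (company, date) tuple per company, keeping the latest date.

    Hypothetical helper name, used here only for illustration.
    """
    by_company = operator.itemgetter(0)
    by_date = operator.itemgetter(1)
    # groupby only merges adjacent items, so sort by company first
    ordered = sorted(records, key=by_company)
    return [max(group, key=by_date)
            for _, group in itertools.groupby(ordered, key=by_company)]


shuffled = [
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
]
result = latest_per_company(shuffled)
print(result)
```

Without the sorted() call, the shuffled input would produce four groups instead of two, since no two neighbouring items share a company.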
You could also use a dictionary...
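import datetime

# date objects compare chronologically; strings like '1990,2,1' and '1990,10,1' would not
data = [('Company A', datetime.date(1980,1,30)),
        ('Company A', datetime.date(1990,1,30)),
        ('Company B', datetime.date(1990,1,30)),
        ('Company B', datetime.date(2000,1,30))]

# seed the dict with one date per company, then keep the larger date on a second pass
datadict = { a:b for a,b in data }
for a, b in data:
    datadict[a] = max(b, datadict[a])
print(datadict)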
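The dictionary idea can also be done in a single pass over the data. A rough sketch, assuming the dates are datetime.date objects as in the question; latest_by_dict is a hypothetical name chosen for this example:

```python
import datetime


def latest_by_dict(records):
    """Map each company to its latest date in one pass (hypothetical helper)."""
    latest = {}
    for company, date in records:
        # store the date on first sighting, or when it is later than the stored one
        if company not in latest or date > latest[company]:
            latest[company] = date
    return latest


records = [
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
]
latest = latest_by_dict(records)
# sorted() turns the mapping back into a list of (company, date) tuples
pairs = sorted(latest.items())
print(pairs)
```

This needs no sorting of the input, so it may be preferable when the list is large or arrives in arbitrary order.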
Here is an example using reduce():
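import datetime
from functools import reduce  # reduce is not a builtin in Python 3

company_dates = [
    ('Company A', datetime.date(1980,1,30)),
    ('Company A', datetime.date(1990,1,30)),
    ('Company B', datetime.date(1990,1,30)),
    ('Company B', datetime.date(2000,1,30)),
]

def reducer(acc, company_date):
    # acc maps each company name to the latest date seen so far
    try:
        acc[company_date[0]] = max(acc[company_date[0]], company_date[1])
    except KeyError:
        # first time this company appears
        acc[company_date[0]] = company_date[1]
    return acc

# named latest rather than sorted, to avoid shadowing the builtin
latest = reduce(reducer, company_dates, {})
print(list(latest.items()))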
Here is another alternative solution using different functions:
import datetime
import operator

company_dates = [
    ('Company A', datetime.date(1980,1,30)),
    ('Company A', datetime.date(1990,1,30)),
    ('Company B', datetime.date(1990,1,30)),
    ('Company B', datetime.date(2000,1,30)),
]

# sort by company, then date, descending; named ordered to avoid shadowing the builtin sorted
ordered = sorted(company_dates, key=operator.itemgetter(0, 1), reverse=True)
unique = set(company_date[0] for company_date in ordered)
# the first match for each company is its latest date; set order is arbitrary, so the order of top is too
top = [next(c for c in ordered if c[0] == company) for company in unique]
print(top)
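The last solution rescans the sorted list once per company. As a possible variation on the same sort-descending idea (not the answerer's code), the first entry seen for each company can be kept in a dict with setdefault, which takes one pass and, on Python 3.7+, gives a predictable order. The name first_per_company is hypothetical:

```python
import datetime
import operator


def first_per_company(records):
    """Sort descending, then keep the first (latest) date seen per company."""
    ordered = sorted(records, key=operator.itemgetter(0, 1), reverse=True)
    seen = {}
    for company, date in ordered:
        # setdefault only stores the value when the key is missing
        seen.setdefault(company, date)
    # dicts preserve insertion order on Python 3.7+
    return list(seen.items())


company_dates = [
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
]
top = first_per_company(company_dates)
print(top)
```

Because the sort is descending on the company name as well, Company B comes before Company A in the result.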