Python: get most recent date per company

I have a list of tuples, each consisting of a company name and a date. A company can have info listed for multiple dates:

 [('Company A', datetime.date(1980,1,30)),
  ('Company A', datetime.date(1990,1,30)),
  ('Company B', datetime.date(1990,1,30)),
  ('Company B', datetime.date(2000,1,30))]

What I want is a list that includes only the most recent date available for each company, i.e. the result:

 [('Company A', datetime.date(1990,1,30)),
  ('Company B', datetime.date(2000,1,30))]

Any ideas?

How about using groupby from itertools, then taking the max:

import datetime
import itertools

x = [('Company A', datetime.date(1980,1,30)),
  ('Company A', datetime.date(1990,1,30)),
  ('Company B', datetime.date(1990,1,30)),
  ('Company B', datetime.date(2000,1,30))]

# groupby only merges adjacent items, so sort by company name first
out = []
for k, g in itertools.groupby(sorted(x, key=lambda y: y[0]), key=lambda y: y[0]):
    # within each company's group, keep the tuple with the latest date
    out.append(max(g, key=lambda y: y[1]))

out
[('Company A', datetime.date(1990, 1, 30)),
 ('Company B', datetime.date(2000, 1, 30))]
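
The sort in the answer above matters because groupby only groups consecutive items with the same key. Here is a minimal sketch of what happens with and without it; the helper name latest_per_company and its presort flag are made up for illustration and are not part of the original answer:

```python
import datetime
import itertools

# Same data as in the question, but deliberately interleaved by company.
records = [
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
]

def latest_per_company(rows, presort=True):
    """Hypothetical helper: latest (company, date) tuple per company."""
    if presort:
        # Sorting by company name makes each company's rows adjacent.
        rows = sorted(rows, key=lambda r: r[0])
    return [max(group, key=lambda r: r[1])
            for _, group in itertools.groupby(rows, key=lambda r: r[0])]

with_sort = latest_per_company(records)
without_sort = latest_per_company(records, presort=False)

print(len(with_sort))     # one entry per company
print(len(without_sort))  # one entry per run of adjacent equal names
```

Without the sort, each company shows up once per run of adjacent rows, so the result can still contain duplicates.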

You could also use a dictionary:

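import datetime

# use real date objects: strings such as '1990,10,1' and '1990,2,1'
# would compare character by character and give the wrong maximum
data = [('Company A', datetime.date(1980,1,30)),
  ('Company A', datetime.date(1990,1,30)),
  ('Company B', datetime.date(1990,1,30)),
  ('Company B', datetime.date(2000,1,30))]

# seed the dictionary with one date per company (the last one seen)
datadict = { a:b for a,b in data }

# then keep the larger (more recent) date for each company
for a, b in data:
    datadict[a] = max(b, datadict[a])

print(datadict)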
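
The dictionary idea can probably also be done in a single pass over the data, without building the dictionary first. A rough sketch, assuming the dates are datetime.date objects as in the question (the names latest and result are just illustrative):

```python
import datetime

data = [
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
]

latest = {}
for company, date in data:
    current = latest.get(company)  # None if the company is not seen yet
    if current is None or date > current:
        latest[company] = date

# Back to a list of (company, date) tuples, sorted by company name.
result = sorted(latest.items())
print(result)
```

This does not need the input to be sorted, since each row is only compared against the best date stored so far for its company.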

Here's an example using reduce():

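import datetime
from functools import reduce  # reduce is no longer a builtin in Python 3

company_dates = [
  ('Company A', datetime.date(1980,1,30)),
  ('Company A', datetime.date(1990,1,30)),
  ('Company B', datetime.date(1990,1,30)),
  ('Company B', datetime.date(2000,1,30)),
]

def reducer(acc, company_date):
  # acc maps each company name to the latest date seen so far
  try:
    acc[company_date[0]] = max(acc[company_date[0]], company_date[1])
  except KeyError:
    # first time this company appears
    acc[company_date[0]] = company_date[1]

  return acc

# named "latest" rather than "sorted" to avoid shadowing the builtin
latest = reduce(reducer, company_dates, {})

print(list(latest.items()))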

Here's another alternative that sorts in descending order and takes the first entry for each company:

import datetime
import operator

company_dates = [
  ('Company A', datetime.date(1980,1,30)),
  ('Company A', datetime.date(1990,1,30)),
  ('Company B', datetime.date(1990,1,30)),
  ('Company B', datetime.date(2000,1,30)),
]

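# sort descending by (company, date) so each company's latest date comes first;
# the result is named "ordered" to avoid shadowing the builtin sorted()
ordered = sorted(company_dates, key=operator.itemgetter(0, 1), reverse=True)
unique = set(company_date[0] for company_date in ordered)
# for each company, take the first (most recent) matching tuple
top = [next(c for c in ordered if c[0] == company) for company in unique]

print(top)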
