
Python itertools.groupby: group a datetime series by hour

The task is pretty simple, and I'm able to partially accomplish it:

import itertools
from dateutil.parser import parse

for timestamp, grp in itertools.groupby(transactions, lambda x: parse(x['date']).hour):
    group = list(grp)
    logger.info(f'{timestamp} : {len(group)}')

-> I get an hour:count array.

However, I want a datetime:count array as the result (where each datetime object represents one hour).

Do I have to build the datetime object in the lambda function (i.e. get parse(x['date']).hour, parse(x['date']).day, parse(x['date']).month, etc. and create a new datetime from those values), or is there another way?

Sample input (transactions); the real data spans weeks or months:

[
 {
  'date': '2018-12-04T15:34:40+00:00',
  'data': 'blabla'
 },
 {
  'date': '2018-12-04T15:38:40+00:00',
  'data': 'blabla'
 },
 {
  'date': '2018-12-04T15:45:40+00:00',
  'data': 'blabla'
 },
]

Sample output:

2018-12-04 13:00:00+00:00 : 6
2018-12-04 14:00:00+00:00 : 1
2018-12-04 15:00:00+00:00 : 2

Thank you

You need to use a datetime object as the key, truncated to the hour. The code below builds it from the date and the hour of each parsed timestamp:

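import itertools
import datetime
from dateutil.parser import parse

# Sample data: 6 timestamps in hour 13, 1 in hour 14, 2 in hour 15
transactions = ['2018-12-04 13:{}0:00+00:00'.format(i) for i in range(6)] + \
               ['2018-12-04 14:{}0:00+00:00'.format(i) for i in range(1)] + \
               ['2018-12-04 15:{}0:00+00:00'.format(i) for i in range(2)]

def hour_key(x):
    # Parse once, then rebuild a datetime truncated to the hour (tzinfo is dropped)
    dt = parse(x)
    return datetime.datetime.combine(dt.date(), datetime.time(dt.hour, 0, 0, 0))

# groupby only merges consecutive items, so the input must already be in time order
for timestamp, grp in itertools.groupby(transactions, key=hour_key):
    count = list(grp)
    print('{}:{}'.format(timestamp, len(count)))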

Output

2018-12-04 13:00:00:6
2018-12-04 14:00:00:1
2018-12-04 15:00:00:2
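
If the UTC offset should survive, as in the asker's sample output, one option is to truncate with `datetime.replace` instead of rebuilding the object. This is only a minimal sketch, assuming ISO 8601 strings in the question's format. `truncate_to_hour` and `count_per_hour` are hypothetical names, the 14:05 record is made-up sample data, and the standard library's `datetime.fromisoformat` (Python 3.7+) stands in for `dateutil.parser.parse` so the example has no third-party dependency:

```python
import itertools
from datetime import datetime


def truncate_to_hour(record):
    # Hypothetical helper: parse the ISO 8601 string and zero out everything
    # below the hour. replace() keeps tzinfo, so the result stays timezone-aware.
    parsed = datetime.fromisoformat(record['date'])
    return parsed.replace(minute=0, second=0, microsecond=0)


def count_per_hour(records):
    # groupby only merges consecutive items, so sort by the same key first.
    ordered = sorted(records, key=truncate_to_hour)
    return [(hour, len(list(grp)))
            for hour, grp in itertools.groupby(ordered, key=truncate_to_hour)]


transactions = [
    {'date': '2018-12-04T15:34:40+00:00', 'data': 'blabla'},
    {'date': '2018-12-04T14:05:00+00:00', 'data': 'blabla'},  # made-up record, out of order on purpose
    {'date': '2018-12-04T15:38:40+00:00', 'data': 'blabla'},
]

for hour, count in count_per_hour(transactions):
    print(f'{hour} : {count}')
```

Because the key is a full datetime rather than just `.hour`, the same hour on different days ends up in different groups, which matters when the data covers weeks or months.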

Since I'm a collections.Counter fan:

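from dateutil.parser import parse
from collections import Counter

counter = Counter()
for transaction in transactions:
    # Slice off ':MM:SS+00:00' so the parsed value is truncated to the hour.
    # Wrap the key in a list: update() with a bare string would count characters.
    counter.update([str(parse(transaction['date'][:-12]))])
for key, value in sorted(counter.items()):
    print(f'{key}: {value}')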

And here is a solution using a list comprehension:

from dateutil.parser import parse
from collections import Counter

# Slice off ':MM:SS+00:00' so each key is the timestamp truncated to the hour
counter = Counter([str(parse(x['date'][:-12])) for x in transactions])

for key, value in sorted(counter.items()):
    print(f'{key}: {value}')
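
The Counter versions above key on strings. If actual datetime objects are wanted as keys, which is what the question asks for, something along these lines might work. This is a sketch under assumptions: it uses the standard library's `datetime.fromisoformat` (Python 3.7+) in place of `dateutil`, and it truncates with `replace` rather than string slicing, so it does not depend on the exact length of the timestamp:

```python
from collections import Counter
from datetime import datetime, timezone

# The three sample records from the question
transactions = [
    {'date': '2018-12-04T15:34:40+00:00', 'data': 'blabla'},
    {'date': '2018-12-04T15:38:40+00:00', 'data': 'blabla'},
    {'date': '2018-12-04T15:45:40+00:00', 'data': 'blabla'},
]

# Keys are timezone-aware datetime objects truncated to the hour.
# Unlike groupby, Counter does not need the input to be sorted.
counter = Counter(
    datetime.fromisoformat(x['date']).replace(minute=0, second=0, microsecond=0)
    for x in transactions
)

for hour, count in sorted(counter.items()):
    print(f'{hour} : {count}')

# Datetime keys can be looked up directly
print(counter[datetime(2018, 12, 4, 15, tzinfo=timezone.utc)])
```

Looking up an hour that never occurs returns 0 instead of raising, which is standard `Counter` behaviour.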
