
How to group items by month and year using itertools.groupby()

Problem: I am trying to take a sorted list and group it based on the month and year but having trouble returning the grouped value correctly...

Assume the following data: a list of dicts, each with a title and a datetime, already ordered by datetime.

import datetime

lst = [
    {'title': 'in the past','date_time': datetime.datetime(2020, 3, 18, 0, 0)},
    {'title': 'Just another event','date_time': datetime.datetime(2020, 10, 1, 19, 7)},
    {'title': 'earlier today 9am','date_time': datetime.datetime(2020, 10, 21, 9, 0)},
    {'title': 'greater than .now()','date_time': datetime.datetime(2020, 10, 21, 23, 0)},
    {'title': 'another one','date_time': datetime.datetime(2020, 10, 30, 10, 0)},
    {'title': 'Me testing the latest event','date_time': datetime.datetime(2020, 10, 30, 12, 0)},
    {'title': '18 Nov 20','date_time': datetime.datetime(2020, 11, 18, 20, 27)},
    {'title': '18 January 2021','date_time': datetime.datetime(2021, 1, 18, 20, 0)},
    {'title': '18 March 21','date_time': datetime.datetime(2021, 3, 18, 20, 0)}
]

Then to group it, run it through itertools.groupby()

from itertools import groupby

def loop_tupe():
    diction = {}
    for key,group in groupby(lst, key=lambda x: (x['date_time'].month, x['date_time'].year)):
        for element in group:
            append_value(diction, key, element)
    return diction

After grouping it by the month and year the returned result looks like

{
    (3, 2020): {'title': 'in the past', 'date_time': datetime.datetime(2020, 3, 18, 0, 0)},
    (10, 2020): [
        {'title': 'Just another event', 'date_time': datetime.datetime(2020, 10, 1, 19, 7)},
        {'title': 'earlier today 9am', 'date_time': datetime.datetime(2020, 10, 21, 9, 0)},
        {'title': 'greater than .now()', 'date_time': datetime.datetime(2020, 10, 21, 23, 0)},
        {'title': 'another one', 'date_time': datetime.datetime(2020, 10, 30, 10, 0)},
        {'title': 'Me testing the latest event', 'date_time': datetime.datetime(2020, 10, 30, 12, 0)}
        ],
    (11, 2020): {'title': '18 Nov 20', 'date_time': datetime.datetime(2020, 11, 18, 20, 27)},
    (1, 2021): {'title': '18 January 2021', 'date_time': datetime.datetime(2021, 1, 18, 20, 0)},
    (3, 2021): {'title': '18 March 21', 'date_time': datetime.datetime(2021, 3, 18, 20, 0)}
}

It has been grouped correctly; however, the dates are within a tuple whereas I need them as one "value", and in this format I am unable to loop over it the way I would with the original list.

I realise it has something to do with the way I'm using the anonymous function within groupby() (and perhaps with how the returned result is built), but I'm unsure how else to apply a month and year grouping within it.

Question: What can I do to group my original data by month & year while also keeping its format relatively similar to the list going in?


Edit

The append_value function that I'm using

def append_value(dict_obj, key, value):
    if key in dict_obj:
        if not isinstance(dict_obj[key], list):
            dict_obj[key] = [dict_obj[key]]
        dict_obj[key].append(value)
    else:
        dict_obj[key] = value

Edit 2

So far this is the closest I'm getting to a solution.

I have changed the function used in groupby to take the datetime and change it into a string to be compared. (I've left the print in there to visualise)

def loop_str(to_sort):
    output={}
    for key,group in groupby(to_sort, key=lambda item: item['date_time'].strftime('%B %Y')):
        for element in group:
            append_value(output,key,element)
    return output

Doing so gives me this output

{
    'March 2020': {
        'title': 'in the past', 'date_time': datetime.datetime(2020, 3, 18, 0, 0)
    },
    'October 2020': [
        {'title': 'Just another event', 'date_time': datetime.datetime(2020, 10, 1, 19, 7)},
        {'title': 'earlier today 9am', 'date_time': datetime.datetime(2020, 10, 21, 9, 0)},
        {'title': 'greater than .now()', 'date_time': datetime.datetime(2020, 10, 21, 23, 0)},
        {'title': 'another one', 'date_time': datetime.datetime(2020, 10, 30, 10, 0)},
        {'title': 'Me testing the latest event', 'date_time': datetime.datetime(2020, 10, 30, 12, 0)}
    ],
    'November 2020': {
        'title': '18 Nov 20', 'date_time': datetime.datetime(2020, 11, 18, 20, 27)
    },
    'January 2021': {
        'title': '18 January 2021', 'date_time': datetime.datetime(2021, 1, 18, 20, 0)
    },
    'March 2021': {
        'title': '18 March 21', 'date_time': datetime.datetime(2021, 3, 18, 20, 0)
    }
}

This is closer to what I need. However, unless I'm missing something, the output is a mix of dicts and lists, which will be more difficult to loop over within a Django template.
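
The mix comes from append_value, which stores the first item for a key as a bare dict and only converts it to a list when a second item arrives. A minimal sketch of a list-only variant, assuming the same groupby key as in loop_str; the function name group_as_lists and the sample data are hypothetical:

```python
import datetime
from itertools import groupby


def group_as_lists(events):
    """Group consecutive events by 'Month Year', always storing a list."""
    output = {}
    key_func = lambda item: item['date_time'].strftime('%B %Y')
    for key, group in groupby(events, key=key_func):
        # setdefault creates an empty list the first time a key is seen,
        # so single-item groups end up as lists too.
        output.setdefault(key, []).extend(group)
    return output


# Hypothetical sample data, already sorted by date_time.
sample = [
    {'title': 'a', 'date_time': datetime.datetime(2020, 3, 18, 0, 0)},
    {'title': 'b', 'date_time': datetime.datetime(2020, 10, 1, 19, 7)},
    {'title': 'c', 'date_time': datetime.datetime(2020, 10, 21, 9, 0)},
]

grouped = group_as_lists(sample)
```

With this shape every value is a list, including months that contain a single event.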

Answer: You can make the groupby key the string you want by formatting the date, then use it in a dict comprehension. The data structure is easier to build if the values are consistently lists, and it will probably be easier to use as well.

from itertools import groupby
import datetime

lst = [
    {'title': 'in the past','date_time': datetime.datetime(2020, 3, 18, 0, 0)},
    {'title': 'Just another event','date_time': datetime.datetime(2020, 10, 1, 19, 7)},
    {'title': 'earlier today 9am','date_time': datetime.datetime(2020, 10, 21, 9, 0)},
    {'title': 'greater than .now()','date_time': datetime.datetime(2020, 10, 21, 23, 0)},
    {'title': 'another one','date_time': datetime.datetime(2020, 10, 30, 10, 0)},
    {'title': 'Me testing the latest event','date_time': datetime.datetime(2020, 10, 30, 12, 0)},
    {'title': '18 Nov 20','date_time': datetime.datetime(2020, 11, 18, 20, 27)},
    {'title': '18 January 2021','date_time': datetime.datetime(2021, 1, 18, 20, 0)},
    {'title': '18 March 21','date_time': datetime.datetime(2021, 3, 18, 20, 0)}
]

groups = groupby(lst, key=lambda x: (x['date_time'].strftime("%B %Y")))

{k: list(g) for k, g in groups}

Result:

{'March 2020': [{'title': 'in the past',
   'date_time': datetime.datetime(2020, 3, 18, 0, 0)}],
 'October 2020': [{'title': 'Just another event',
   'date_time': datetime.datetime(2020, 10, 1, 19, 7)},
  {'title': 'earlier today 9am',
   'date_time': datetime.datetime(2020, 10, 21, 9, 0)},
  {'title': 'greater than .now()',
   'date_time': datetime.datetime(2020, 10, 21, 23, 0)},
  {'title': 'another one',
   'date_time': datetime.datetime(2020, 10, 30, 10, 0)},
  {'title': 'Me testing the latest event',
   'date_time': datetime.datetime(2020, 10, 30, 12, 0)}],
 'November 2020': [{'title': '18 Nov 20',
   'date_time': datetime.datetime(2020, 11, 18, 20, 27)}],
 'January 2021': [{'title': '18 January 2021',
   'date_time': datetime.datetime(2021, 1, 18, 20, 0)}],
 'March 2021': [{'title': '18 March 21',
   'date_time': datetime.datetime(2021, 3, 18, 20, 0)}]}
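
One caveat worth noting: groupby only merges consecutive items that share a key, so the input has to be sorted by date first (as the list in the question already is). A minimal sketch for unsorted input; the helper name group_by_month and the sample events are hypothetical:

```python
import datetime
from itertools import groupby


def group_by_month(events):
    """Sort by date_time, then group into {'Month Year': [events]}."""
    ordered = sorted(events, key=lambda e: e['date_time'])
    month_label = lambda e: e['date_time'].strftime('%B %Y')
    return {label: list(group) for label, group in groupby(ordered, key=month_label)}


# Hypothetical, deliberately unsorted sample data.
events = [
    {'title': 'late October', 'date_time': datetime.datetime(2020, 10, 30, 10, 0)},
    {'title': 'March', 'date_time': datetime.datetime(2020, 3, 18, 0, 0)},
    {'title': 'early October', 'date_time': datetime.datetime(2020, 10, 1, 19, 7)},
]

grouped = group_by_month(events)

# Every value is a list, so one loop shape handles all months.
for label, items in grouped.items():
    print(label, [item['title'] for item in items])
```

Because each value is a list, the same two-level loop should carry over to a Django template, for example by iterating over grouped.items in the outer loop and over each list in the inner one.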
