How to count consecutive days in a list of dates

# below is the example of list of dates:
date_list = [date(2021, 1, 2), date(2021, 1, 3), date(2021, 1, 5)]

I was thinking of counting all consecutive dates and making a dictionary where the key is the start date and the value is the count of consecutive days.

dic = {date(2021, 1, 2): 2, date(2021, 1, 5): 1}

Could anyone help me with the steps I should take to accomplish this? Should I approach the problem in a different way? Thank you so much.

This is more of an algorithm problem. Here is the code, with some test cases:

from datetime import date
from datetime import timedelta


def is_consecutive(date1, date2):
    return date1 + timedelta(days=1) == date2


def my_func(date_list):
    if not date_list:
        return {}
    if len(date_list) == 1:
        return {date_list[0]: 1}

    date_list.sort()
    res = {}
    start_date = date_list[0]
    cnt = 1
    for idx, cur_date in enumerate(date_list[1:], start=1):
        # print(idx, cur_date)
        if is_consecutive(date_list[idx - 1], cur_date):
            cnt += 1
        else:
            res[start_date] = cnt
            start_date = cur_date
            cnt = 1
    # store the last run, which the loop body never closes
    res[start_date] = cnt
    return res


if __name__ == "__main__":
    # below is the example of list of dates:
    date_list = [date(2021, 1, 2), date(2021, 1, 3), date(2021, 1, 5)]
    # {datetime.date(2021, 1, 2): 2, datetime.date(2021, 1, 5): 1}
    print(my_func(date_list))

    date_list = [date(2021, 1, 5), date(2021, 1, 2), date(2021, 1, 3), date(2021, 1, 6), date(2021, 1, 31),
                 date(2021, 2, 1), date(2021, 2, 2)]
    # {datetime.date(2021, 1, 2): 2, datetime.date(2021, 1, 5): 2, datetime.date(2021, 1, 31): 3}
    print(my_func(date_list))

    date_list = [date(2021, 1, 5)]
    # {datetime.date(2021, 1, 5): 1}
    print(my_func(date_list))

    date_list = []
    # {}
    print(my_func(date_list))

There are different ways to solve your problem. In this case a simple and effective approach is a for loop that checks whether the condition (the dates are consecutive) holds between the current loop item and the "ongoing" group of dates:

from datetime import date

date_list = [date(2021, 1, 2), date(2021, 1, 3), date(2021, 1, 5)]

result = {}

c = date_list[0]
n = 1

for d in date_list[1:]:
    if (d - c).days == n:
        n = n + 1
    else:
        result[c] = n
        c = d
        n = 1

result[c] = n

print(result)

We start with a "group" in which we place the first date, and a counter n for the number of elements in the group. Then we start the loop (from the second item). At each iteration we check the condition: is the current date n days after the first date in the group? If yes, we increment n and go on; otherwise we store the group in the dictionary and start a new group at the current date. Note that I am talking about a "group" of dates, but you don't need to store all of them: the first one and n (the number of days spanned by the group) are sufficient here.

Important: when using this approach you always need to account for the fact that the last group of items needs to be processed outside the loop.
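
As a minimal sketch, the same loop can be wrapped in a function so that the final step after the loop is explicit. The name count_runs and the empty-list guard are assumptions added for illustration, and the input is assumed to be sorted and free of duplicates.

```python
from datetime import date


def count_runs(sorted_dates):
    """Map the first date of each run to the run length (hypothetical helper)."""
    result = {}
    if not sorted_dates:  # guard added here: indexing [0] fails on an empty list
        return result

    start, n = sorted_dates[0], 1
    for d in sorted_dates[1:]:
        if (d - start).days == n:  # d lies exactly n days after the run's first date
            n += 1
        else:
            result[start] = n      # close the finished run
            start, n = d, 1        # open a new run at d
    result[start] = n              # the last run is only stored here, after the loop
    return result


print(count_runs([date(2021, 1, 2), date(2021, 1, 3), date(2021, 1, 5)]))
# {datetime.date(2021, 1, 2): 2, datetime.date(2021, 1, 5): 1}
```

Leaving out the line after the loop would silently drop the final run from the result.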

As I said, there are other approaches. I chose this one because, IMHO, it is quite easy to understand how it works.
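
For comparison, here is one possible alternative, sketched with itertools.groupby. This is not the approach described above, and the function name count_runs_groupby is made up for this example. It relies on the observation that, inside a run of consecutive days, a date's ordinal and its position in the sorted list both grow by 1, so their difference is constant.

```python
from datetime import date
from itertools import groupby


def count_runs_groupby(dates):
    """Group sorted, de-duplicated dates by (ordinal - position)."""
    ordered = sorted(set(dates))
    result = {}
    # pair is (index, date); the key is constant within one run of consecutive days
    for _, group in groupby(enumerate(ordered),
                            key=lambda pair: pair[1].toordinal() - pair[0]):
        run = [d for _, d in group]
        result[run[0]] = len(run)  # first date of the run -> run length
    return result


print(count_runs_groupby([date(2021, 1, 3), date(2021, 1, 2), date(2021, 1, 5)]))
# {datetime.date(2021, 1, 2): 2, datetime.date(2021, 1, 5): 1}
```

It is more compact than the explicit loop, at the cost of being harder to follow at first read.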

You need to loop over date_list and compare each date to the previous one to see if they are consecutive days. Assuming you are using the built-in datetime.date class, you can use the toordinal() method to make comparing dates easy. You can also use a timedelta of 1 day. You will want to keep track of the first date in a chain of consecutive dates so you can continue to access it in the dictionary.
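
A small sketch of the two checks mentioned here, using two dates chosen for illustration because they straddle a month boundary:

```python
from datetime import date, timedelta

a = date(2021, 1, 31)
b = date(2021, 2, 1)

# toordinal() returns the day number in the proleptic Gregorian calendar
# (0001-01-01 is day 1), so consecutive days always differ by exactly 1.
by_ordinal = b.toordinal() - a.toordinal() == 1

# Equivalent check: subtracting two dates yields a timedelta.
by_timedelta = b - a == timedelta(days=1)

print(by_ordinal, by_timedelta)  # True True
```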

The following code should work assuming your date list is sorted.

from datetime import date
date_list = [date(2021, 1, 2), date(2021, 1, 3), date(2021, 1, 5)]
dic = {}

date_range_start = None
for i in range(len(date_list)):
    if i == 0:
        date_range_start = date_list[i]
        dic[date_range_start] = 1
    else:
        if date_list[i].toordinal() - date_list[i-1].toordinal() == 1:
            dic[date_range_start] += 1
        else:
            date_range_start = date_list[i]
            dic[date_range_start] = 1

I would do it like this: first sort the list with date_list.sort(). This is rather important, since you will not end up with the correct result if the list is not sorted.

Then use a for loop to iterate over the entire list. Keep the first date of the current run in a variable such as current_date, and the length of that run in a counter such as current_max_consecutive_day. If the current date extends the run, increment the counter; otherwise reset current_date to the new date and the counter to 1.

from datetime import date

date_list = [date(2021, 1, 2), date(2021, 1, 3), date(2021, 1, 5)]
date_list.sort() # if the list is not sorted, then please sort it

dic = {}
current_max_consecutive_day = 1
current_date = date_list[0]

# This only works if the list is sorted
for d in date_list:
    # compare full dates (not just .day) so a run can cross a month boundary;
    # d extends the run if it lies exactly `count` days after the run's start
    if (d - current_date).days == current_max_consecutive_day:
        # consecutive day found: extend the current run
        current_max_consecutive_day += 1
    else:
        # otherwise start a new run at d and restart the count
        current_max_consecutive_day = 1
        current_date = d
    dic[current_date] = current_max_consecutive_day
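
A small sketch of the difference between comparing the .day attribute and subtracting whole date objects, which matters as soon as a run crosses a month boundary. The dates are chosen only for illustration.

```python
from datetime import date

jan1 = date(2021, 1, 1)
jan31 = date(2021, 1, 31)
feb1 = date(2021, 2, 1)
feb2 = date(2021, 2, 2)

# Day-of-month arithmetic ignores the month and the year.
print(feb1.day - jan31.day)  # -30, although the two dates are consecutive
print(feb2.day - jan1.day)   # 1, although the two dates are a month apart

# Subtracting the dates themselves gives a timedelta with the real gap.
print((feb1 - jan31).days)   # 1
print((feb2 - jan1).days)    # 32
```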

    

You can convert your list into a dictionary with a count of one for each date. Then merge the dates that are consecutive using each day as a starting point:

from datetime import date,timedelta

date_list = [date(2021, 1, 3), date(2021, 1, 2), date(2021, 1, 5)]

dic = dict.fromkeys(date_list,1) # start with each date = span of 1
for d in date_list:              # try to merge each date with next ones
    c = d
    while c in dic:                    # get consecutive dates starting from d
        if c>d: dic[d] += dic.pop(c)   # merge/remove later date
        c += timedelta(days=1)         # check further 
        
print(dic)
# {datetime.date(2021, 1, 2): 2, datetime.date(2021, 1, 5): 1}

This will work even when the dates are not in ascending order, so you don't need to sort, and it runs in O(n) time on average (each date is looked up in the dictionary and merged at most once).

Note that this solution does not take into account the possibility of duplicate dates in the list (i.e. each date only counts for 1 even if there are multiple instances in the list).
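
If duplicates should explicitly be treated as a single day, one possible variant (a sketch, not part of the solution above) is to put the dates in a set and count forward only from dates whose previous day is absent. The name count_runs_unique is hypothetical.

```python
from datetime import date, timedelta


def count_runs_unique(dates):
    """Count runs of consecutive days, treating repeated dates as one day."""
    one_day = timedelta(days=1)
    days = set(dates)              # duplicates collapse here
    result = {}
    for d in days:
        if d - one_day in days:    # d is not the first day of its run
            continue
        n = 1
        while d + n * one_day in days:  # walk forward to the end of the run
            n += 1
        result[d] = n
    return result


runs = count_runs_unique(
    [date(2021, 1, 3), date(2021, 1, 2), date(2021, 1, 2), date(2021, 1, 5)]
)
print(sorted(runs.items()))
# [(datetime.date(2021, 1, 2), 2), (datetime.date(2021, 1, 5), 1)]
```

Like the dictionary version, this needs no sorting. Each date is visited once as a possible run start and at most once while walking a run.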
