Build a dictionary of values in Python based on comparing a list of dates to dates in a dictionary

I have a list of dates (the last 30 days) that I build, and then I also have data returning from my database with dates and a count at those dates (I'll post some sample data after this description). I want to build a dictionary off of these two that will put in a placeholder value if the date is not returned from the database.

This is my list of dates (screenshot: http://screencast.com/t/VeB37A3k7KO):

temp_dates = [
    datetime.date(2014, 4, 21),
    datetime.date(2014, 4, 22),
    datetime.date(2014, 4, 23),
    datetime.date(2014, 4, 24),
    ....
    datetime.date(2014, 5, 18),
    datetime.date(2014, 5, 19),
    datetime.date(2014, 5, 20),
    datetime.date(2014, 5, 21)
]

The data returned from my database is a list of dictionaries. It looks like this:

temp_data = [
    {u'daily_count': 3, u'total_count': 684, u'm_date': datetime.date(2014, 4, 21)},
    {u'daily_count': 2, u'total_count': 686, u'm_date': datetime.date(2014, 4, 22)},
    {u'daily_count': 32, u'total_count': 718, u'm_date': datetime.date(2014, 4, 23)},
    {u'daily_count': 1, u'total_count': 719, u'm_date': datetime.date(2014, 4, 25)},
    {u'daily_count': 1, u'total_count': 720, u'm_date': datetime.date(2014, 4, 26)},
    {u'daily_count': 17, u'total_count': 737, u'm_date': datetime.date(2014, 4, 29)},
    {u'daily_count': 1, u'total_count': 740, u'm_date': datetime.date(2014, 5, 2)},
    {u'daily_count': 1, u'total_count': 741, u'm_date': datetime.date(2014, 5, 4)},
    {u'daily_count': 1, u'total_count': 744, u'm_date': datetime.date(2014, 5, 6)},
    {u'daily_count': 2, u'total_count': 746, u'm_date': datetime.date(2014, 5, 8)}
    ...... etc.
]

I want to build a dictionary that will loop through the dates in temp_dates and if the date in temp_data matches, put the date as a new dictionary key with the total_count as the value. If there is a date that doesn't match then put in the previous value entered.

This is what I tried:

sql_info = {}
placeholder = 0

for i in temp_dates:
    for j in temp_data:
        if i == j['m_date']:
            sql_info[i] = j['total_count']
            placeholder = j['total_count']
            break
        else:
            sql_info[i] = placeholder

This doesn't work. After putting in the first value (684) on the first pass through the loop, it just puts in the placeholder every time: http://screencast.com/t/BWUfFvYL

How can I fix this problem?


My working attempt

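    sql_info = {}
    placeholder = 0  # 0 means "no count seen yet"

    for i in temp_dates:
        dd = i.strftime('%m-%d-%Y')  # string key, e.g. '04-21-2014'
        sql_info[dd] = {}
        for j in temp_data:
            if i == j['m_date']:
                # Date found: store its total and remember it for later gaps
                sql_info[dd]['total_count'] = j['total_count']
                placeholder = j['total_count']
                break
            else:
                if placeholder == 0:
                    # Nothing seen yet: back out the count before this row
                    placeholder = j['total_count'] - j['daily_count']
                sql_info[dd]['total_count'] = placeholder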

If the date is not there the first time, I calculate total_count - daily_count to get the count that existed before that date. The expected output is this: http://screencast.com/t/0nCGTnAwJq. If a date isn't there, I add it to the dict and put in the appropriate values (I put in five different values per date).
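The seeding rule described above can be sketched in isolation. This is only an illustration under assumptions: seed_placeholder and sample_rows are hypothetical names, and each row is assumed to carry the daily_count, total_count and m_date keys shown in temp_data.

```python
import datetime


def seed_placeholder(rows):
    """Return the count that existed just before the earliest row.

    Hypothetical helper: assumes every row has 'daily_count',
    'total_count' and 'm_date' keys, as in the sample data.
    """
    if not rows:
        return 0
    # Pick the earliest row, regardless of the order the rows arrive in.
    first = min(rows, key=lambda row: row['m_date'])
    return first['total_count'] - first['daily_count']


# Made-up sample: two rows taken from the question's data.
sample_rows = [
    {'daily_count': 2, 'total_count': 686, 'm_date': datetime.date(2014, 4, 22)},
    {'daily_count': 32, 'total_count': 718, 'm_date': datetime.date(2014, 4, 23)},
]

print(seed_placeholder(sample_rows))  # 684, i.e. 686 - 2
```

Dates that fall before the first returned row would then get this value instead of 0.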

I'm not fully sure I understand what you want, but this keeps track of all placeholders and, when a date doesn't match, uses placeholder[-2] to put in the previous total_count value.

If you don't want the value to change until another date matches, you can use a counter to keep track and use something like placeholder[-count].

sql_info = {}
placeholder = []
for i,j in zip(temp_data,temp_dates):
    placeholder.append(i['total_count'])
    if i['m_date'] in temp_dates:
        sql_info[j] = i['total_count']
    else:
        sql_info[j] = placeholder[-2]

This uses strftime to match your edited answer.

sql_info = {}
placeholder = []
count = 1
for i,j in zip(temp_data,temp_dates):
    dd = j.strftime('%m-%d-%Y')
    placeholder.append(i['total_count'])
    if i['m_date'] in temp_dates:
        sql_info[dd] = i['total_count']
    else:
        count += 1
        sql_info[dd] = placeholder[-count]
print sql_info

This is happening because you call "break" as soon as the function doesn't find i==j['m_date'] the first time.

In this example, because your first two values from i and j match each other, it sets placeholder to 684 and then assigns that value to sql_info[i] for the rest of the loop.
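One way to sidestep the nested loop entirely is to index the rows by date first and then walk the dates once, carrying the last seen value forward. The following is a minimal sketch, not the asker's code: fill_missing_dates is a hypothetical name, the sample is shortened, and it assumes the dates are in ascending order.

```python
import datetime


def fill_missing_dates(dates, rows, placeholder=0):
    """Map each date to its total_count, carrying the last value forward.

    Hypothetical helper: assumes `dates` is sorted ascending and each
    row has 'm_date' and 'total_count' keys.
    """
    # Index the rows by date so each lookup is a single dict access.
    totals_by_date = {row['m_date']: row['total_count'] for row in rows}
    filled = {}
    for day in dates:
        if day in totals_by_date:
            # Date present: update the running value.
            placeholder = totals_by_date[day]
        # Date absent: the previous value is reused unchanged.
        filled[day] = placeholder
    return filled


# Shortened sample: six consecutive days, with 2014-04-24 missing from the rows.
dates = [datetime.date(2014, 4, 21) + datetime.timedelta(days=n) for n in range(6)]
rows = [
    {'total_count': 684, 'm_date': datetime.date(2014, 4, 21)},
    {'total_count': 686, 'm_date': datetime.date(2014, 4, 22)},
    {'total_count': 718, 'm_date': datetime.date(2014, 4, 23)},
    {'total_count': 719, 'm_date': datetime.date(2014, 4, 25)},
    {'total_count': 720, 'm_date': datetime.date(2014, 4, 26)},
]

result = fill_missing_dates(dates, rows)
print(result[datetime.date(2014, 4, 24)])  # 718, carried over from 2014-04-23
```

Because each date is looked up once, there is no inner loop that could overwrite a value after it has been set.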

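The best choice is probably to alter your query to select only rows whose m_date is in your list.

However, I think the following should work. It looks up, for each date in your list, the most recent row on or before that date, and falls back to a default when there is none:

import bisect
def get_date_count_dict(list_of_dates, dates_count_dict, default=0):
    dates_items = sorted(dates_count_dict.items(), key=lambda item: item[0])
    sorted_dates, sorted_counts = zip(*dates_items)
    result = {}
    for a_date in list_of_dates:
        # bisect_right gives the index just past the last date <= a_date
        idx = bisect.bisect_right(sorted_dates, a_date)
        result[a_date] = sorted_counts[idx - 1] if idx else default
    return result

new_data = dict([(d['m_date'], d['total_count']) for d in temp_data])
final_data = get_date_count_dict(temp_dates, new_data)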
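As a small illustration of how a bisect-based lookup behaves, here is a sketch with three made-up rows. count_on_or_before is a hypothetical helper, not part of the answer's code. The point to note is that bisect.bisect (an alias of bisect.bisect_right) returns the insertion point to the right of an equal element, so the row on or before a date sits one index earlier.

```python
import bisect
import datetime

# Made-up sample, already sorted by date.
sorted_dates = (
    datetime.date(2014, 4, 21),
    datetime.date(2014, 4, 23),
    datetime.date(2014, 4, 25),
)
sorted_counts = (684, 718, 719)


def count_on_or_before(a_date, default=0):
    """Return the count of the latest row dated on or before a_date."""
    # Index just past the last entry that is <= a_date.
    idx = bisect.bisect_right(sorted_dates, a_date)
    # idx == 0 means a_date precedes every row, so use the default.
    return sorted_counts[idx - 1] if idx else default


print(count_on_or_before(datetime.date(2014, 4, 23)))  # 718 (exact match)
print(count_on_or_before(datetime.date(2014, 4, 24)))  # 718 (carried forward)
print(count_on_or_before(datetime.date(2014, 4, 20)))  # 0 (before the first row)
```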
