
Python time series: merging daily data in a dictionary into weekly data

I have a dictionary like the one below.

my_dict.keys() = 
dict_keys([20160101, 20160102, 20160103, 20160104, 20160105, 20160106,
       20160107, 20160108, 20160109, 20160110, 20160111, 20160112,
       20160113, 20160114, 20160115, 20160116, 20160117, 20160118,
       20160119, 20160120, 20160121, 20160122, 20160123, 20160124,
       ......    
       20171203, 20171204, 20171213, 20171215, 20171216, 20171217,
       20171218, 20171219, 20171220, 20171221, 20171222, 20171223,
       20171224, 20171225, 20171226, 20171227, 20171228, 20171229,
       20171230, 20171231])

my_dict[20160101] = 
array([[ 0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  2.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  2.],
       [ 0.,  0.,  4.,  0.,  0.,  0.]])

As you can see, my keys are dates, and for each date I have a 6 x 6 array of floats. The arrays under every key of my_dict have the same indexes.

An important thing to notice is that my_dict does not contain every day. For example, 20171204 is followed by 20171213 and 20171215, so dates can be skipped.

My task is to turn the daily data (which does not cover every single day) into weekly data by adding up all the values within each week. In other words, from the first week of 2016 to the last week of 2017, sum all the values within each week and produce weekly data. Since the first week of 2016 starts on 20160103 (Sunday), I can disregard the 20160101 and 20160102 entries in my_dict, as well as the final week of 2017. Can you help me with this problem? Thanks in advance!

-------edit--------- It seems my question was not clear enough, so here is a quick example. Since I want to follow the pandas datetime week convention, each week starts on Sunday. So the first week of 2016 is 20160103, 20160104, 20160105, 20160106, 20160107, 20160108, 20160109.

So in my new dictionary, weekly_dict[201601] (where 201601 denotes the first week of 2016) should hold the sum of all the values under keys 20160103, 20160104, 20160105, 20160106, 20160107, 20160108 and 20160109.

weekly_dict = {}
weekly_dict[201601] = my_dict[20160103] + my_dict[20160104] + my_dict[20160105] + my_dict[20160106] + my_dict[20160107] + my_dict[20160108] + my_dict[20160109]

And so on for the following weeks. Hope this makes sense. Thanks!

This is probably a job for pandas:

import pandas as pd

# First, get a list of keys
date_ints = list(my_dict)
# Turn them into a pandas Series object
date_int_series = pd.Series(date_ints)
# Cast them to a string, then format them into a full datetime-type with the proper
# format specification
datetime_series = pd.to_datetime(date_int_series.astype('str'), format='%Y%m%d')
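# Create a dictionary mapping each date integer -> ISO week of the year
# (dt.isocalendar() needs pandas >= 1.1; the older dt.week accessor was removed in pandas 2.0)
date_int_to_week = dict(zip(date_int_series, datetime_series.dt.isocalendar().week))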

This dictionary has each key of my_dict as a key, with the corresponding week of the year as its value.
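Which week a date falls in depends on the numbering convention. As a small illustration (standard library only; week_numbers is a hypothetical helper, not part of the answer), the sketch below compares the ISO 8601 week, which starts on Monday, with the Sunday-based week reported by strftime('%U'), where the days before the first Sunday of the year count as week 0.

```python
from datetime import date

def week_numbers(date_int):
    """Return (ISO week, Sunday-based %U week) for a YYYYMMDD integer."""
    d = date(date_int // 10000, date_int // 100 % 100, date_int % 100)
    iso_week = d.isocalendar()[1]         # ISO 8601: weeks start on Monday
    sunday_week = int(d.strftime('%U'))   # weeks start on Sunday, week 0 before the first Sunday
    return iso_week, sunday_week

for day in (20160101, 20160103, 20160104):
    print(day, week_numbers(day))
# 20160101 (53, 0)   Friday: still ISO week 53 of 2015
# 20160103 (53, 1)   Sunday: first Sunday-based week of 2016
# 20160104 (1, 1)    Monday: ISO week 1 begins
```

For the week layout described in the question (first week of 2016 = 20160103 to 20160109), the Sunday-based number is the one that matches.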

Edit:

If what you're looking for is to sum the entries of your original dictionary by week, you can do something like this:

week_to_date_list = {}
for date_int, week in date_int_to_week.items():
    if week not in week_to_date_list:
        week_to_date_list[week] = []
    week_to_date_list[week].append(date_int)

my_dict_weekly = {}
for week in week_to_date_list:
    arrays_in_week = [my_dict[day_int] for day_int in week_to_date_list[week]]
    # the built-in sum adds the arrays element-wise, starting from 0
    my_dict_weekly[week] = sum(arrays_in_week)

my_dict_weekly should now be a dictionary with the week of the year as its key and the element-wise sum of all the arrays for that week as its value. The key holds only the week number, so the same week of 2016 and 2017 ends up in one entry; include the year in the key if the two years need to stay separate.
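One possible way to get the YYYYWW keys from the question (for example 201601) is to build the key from strftime('%Y%U'). The following is only a minimal sketch using the standard library rather than pandas; week_key and weekly_sums are hypothetical helper names, and the values merely need to support +, as same-shaped NumPy arrays do. It skips week 00 (the days before the first Sunday of the year), but it does not drop a partial final week.

```python
from datetime import datetime
from functools import reduce
import operator

def week_key(date_int):
    """Map a YYYYMMDD integer to YYYYWW, with WW the Sunday-based week (%U)."""
    d = datetime.strptime(str(date_int), '%Y%m%d')
    return int(d.strftime('%Y%U'))

def weekly_sums(daily):
    """Sum the values of `daily` (YYYYMMDD -> value) per Sunday-based week."""
    grouped = {}
    for date_int in sorted(daily):
        key = week_key(date_int)
        if key % 100 == 0:
            # week 00: days before the first Sunday of the year, ignored here
            continue
        grouped.setdefault(key, []).append(daily[date_int])
    # operator.add on NumPy arrays is element-wise addition
    return {key: reduce(operator.add, values) for key, values in grouped.items()}
```

With the dictionary from the question this would be called as weekly_dict = weekly_sums(my_dict). Missing days need no special handling, because only the dates that are present get grouped.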

If I understood your question correctly, I think you can solve it using datetime and timedelta from the datetime module, as in this example:

from datetime import datetime, timedelta

def get_days_of_week(year, week=1):
    # number of the days
    days = {'Monday': 1, 'Tuesday': 2, 'Wednesday': 3, 
            'Thursday': 4, 'Friday': 5, 'Saturday': 6, 'Sunday': 7}
    # construct the datetime object with the year and the desired week
    a = datetime.strptime('{0}'.format(year), '%Y') + timedelta(days=7*(week-1))
    # Every week should start on Sunday, so skip ahead to the first Sunday
    a += timedelta(days=7-days.get(a.strftime('%A'), 0))
    for k in range(0, 7):
        yield (a + timedelta(days=k)).strftime('%Y%m%d')

days = list(get_days_of_week(2016, week=1))
print('2016 / week = 1:', days)

days = list(get_days_of_week(2016, week=22))
print('2016 / week = 22:', days)

Output:

2016 / week = 1: 
 ['20160103',
 '20160104',
 '20160105',
 '20160106',
 '20160107',
 '20160108',
 '20160109']

2016 / week = 22: 
 ['20160529',
 '20160530',
 '20160531',
 '20160601',
 '20160602',
 '20160603',
 '20160604']

Edit:

According to your last edit, this code may fulfill your needs:

from datetime import datetime, timedelta

def get_days_of_week(data):
    # number of the days
    days = {'Monday': 1, 'Tuesday': 2, 'Wednesday': 3,
            'Thursday': 4, 'Friday': 5, 'Saturday': 6, 'Sunday': 7}
    date = datetime.strptime('{}'.format(data), '%Y%m%d')
    # get week number
    week = int(date.strftime('%U'))
    # get year
    year = date.strftime('%Y')
    # start from January 1st of that year and move forward by `week` whole weeks
    a = datetime.strptime(year, '%Y') + timedelta(days=7*week)
    # every week should start on Sunday, so skip ahead to the next Sunday
    a += timedelta(days=7-days.get(a.strftime('%A'), 0))

    # key: the leading YYYYMM digits of the input date; value: the seven days starting at that Sunday
    return {int(str(data)[:-2]): [int((a + timedelta(days=k)).strftime('%Y%m%d')) for k in range(0, 7)]}

week_dict = {}
week_dict.update(get_days_of_week(20160101))
week_dict.update(get_days_of_week(20160623))
print(week_dict[201601])
print(week_dict[201606])

print(week_dict)

Output:

[20160103, 20160104, 20160105, 20160106, 20160107, 20160108, 20160109]
[20160626, 20160627, 20160628, 20160629, 20160630, 20160701, 20160702]
{ 201601: [ 20160103,
            20160104,
            20160105,
            20160106,
            20160107,
            20160108,
            20160109],
  201606: [ 20160626,
            20160627,
            20160628,
            20160629,
            20160630,
            20160701,
            20160702]}
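Note that in the output above, 20160623 maps to the week beginning 20160626, which is the Sunday after that date, and the key 201606 is the year and month. If the goal is instead the Sunday-started week that contains a given date, one option is to step back to the most recent Sunday with date.weekday(). This is a sketch under that assumption; sunday_week_days is a hypothetical helper name, and it avoids the day-name lookup, so it does not depend on strftime('%A') returning English names.

```python
from datetime import date, timedelta

def sunday_week_days(date_int):
    """Return the seven YYYYMMDD integers of the Sunday-started week containing date_int."""
    d = date(date_int // 10000, date_int // 100 % 100, date_int % 100)
    # date.weekday(): Monday=0 ... Sunday=6, so (weekday + 1) % 7 is the number of days since Sunday
    start = d - timedelta(days=(d.weekday() + 1) % 7)
    return [int((start + timedelta(days=k)).strftime('%Y%m%d')) for k in range(7)]

print(sunday_week_days(20160623))
# [20160619, 20160620, 20160621, 20160622, 20160623, 20160624, 20160625]
```

The first element of the returned list can also serve as a week key that is unambiguous across years.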
