
Inserting data based on hour and day of the week

I have a set of hourly data taken from 07-Feb-19 to 17-Feb-19:

                             t     v_amm     v_alc     v_no2
0    2019-02-07 08:00:00+00:00  0.320000  0.344000  1.612000
1    2019-02-07 09:00:00+00:00  0.322889  0.391778  1.580889
2    2019-02-07 10:00:00+00:00  0.209375  0.325208  2.371250
...
251  2019-02-17 19:00:00+00:00  1.082041  0.652041  0.967143
252  2019-02-17 20:00:00+00:00  0.936923  0.598654  1.048077
253  2019-02-17 21:00:00+00:00  0.652553  0.499574  1.184894

and another similar set of hourly data taken from 01-Mar-19 to 11-Mar-19:

                            t     v_amm     v_alc     v_no2
0   2019-03-01 00:00:00+00:00  0.428222  0.384444  1.288222
1   2019-03-01 01:00:00+00:00  0.398600  0.359600  1.325800
2   2019-03-01 02:00:00+00:00  0.365682  0.352273  1.360000
...
244 2019-03-11 04:00:00+00:00  0.444048  0.415238  1.265000
245 2019-03-11 05:00:00+00:00  0.590698  0.591395  1.156977
246 2019-03-11 06:00:00+00:00  0.497872  0.465319  1.228298

However, there is no data available between 17-Feb-19 and 01-Mar-19. Hence, I'd like to insert the following data (grouped by day of the week and hour) into the missing dates and times:

                     v_amm     v_alc     v_no2
day_of_week hour                              
0           0     0.432222  0.351111  1.258889
            1     0.371026  0.324359  1.323333
            2     0.371026  0.324359  1.323333
            3     0.250000  0.285000  1.510000
            4     0.220000  0.274500  1.616500
            5     0.195263  0.264211  1.666053
...
6           18    0.919158  0.557793  1.018703
            19    1.065220  0.599320  0.965771
            20    0.896227  0.543689  1.045634
            21    0.648488  0.469210  1.187928
            22    0.592200  0.417200  1.154400
            23    0.485918  0.366531  1.215918

Does anyone know how to do this in pandas?

First, generate the missing timestamps, look up the grouped values for each timestamp's day of the week and hour, then concatenate the dataframes.

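import pandas as pd

# Hourly timestamps for the gap only: after 2019-02-17 21:00 and before 2019-03-01 00:00 (UTC),
# so that no timestamp already present in <df1> or <df2> is duplicated
new_index = pd.date_range(start='2019-02-17 22:00', end='2019-02-28 23:00', freq='h', tz='UTC')
new_df = pd.DataFrame({'t': new_index})
new_df['day_of_week'] = new_df['t'].dt.dayofweek  # Monday=0 ... Sunday=6
new_df['hour'] = new_df['t'].dt.hour
# <your_df> is the table of values indexed by (day_of_week, hour)
new_df = new_df.merge(<your_df>, on=['day_of_week', 'hour'], how='left')
new_df = new_df.drop(['day_of_week', 'hour'], axis=1)
filled_df = pd.concat([<df1>, new_df, <df2>], axis=0, ignore_index=True)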
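
For anyone who wants to try the steps above end to end, here is a minimal runnable sketch. The values are random stand-ins, not the question's measurements, and the names make_hourly, profile and gap are made up for illustration. It also assumes the grouped table is simply the mean of the observed data per day of the week and hour, which the question does not actually state.

```python
import numpy as np
import pandas as pd

VALUE_COLS = ["v_amm", "v_alc", "v_no2"]
KEYS = ["day_of_week", "hour"]


def make_hourly(start, end, seed):
    """Synthetic hourly frame shaped like the question's data (values are random)."""
    t = pd.date_range(start=start, end=end, freq="h", tz="UTC")
    rng = np.random.default_rng(seed)
    data = {"t": t}
    for col in VALUE_COLS:
        data[col] = rng.random(len(t))
    return pd.DataFrame(data)


# Stand-ins for the two observed data sets
df1 = make_hourly("2019-02-07 08:00", "2019-02-17 21:00", seed=0)
df2 = make_hourly("2019-03-01 00:00", "2019-03-11 06:00", seed=1)

# Assumption: the fill table is the mean per (day_of_week, hour) of the observed data
observed = pd.concat([df1, df2], ignore_index=True)
observed["day_of_week"] = observed["t"].dt.dayofweek  # Monday=0 ... Sunday=6
observed["hour"] = observed["t"].dt.hour
profile = observed.groupby(KEYS)[VALUE_COLS].mean()

# Hourly timestamps strictly between the end of df1 and the start of df2
one_hour = pd.Timedelta(hours=1)
gap = pd.DataFrame({
    "t": pd.date_range(start=df1["t"].max() + one_hour,
                       end=df2["t"].min() - one_hour,
                       freq="h")
})
gap["day_of_week"] = gap["t"].dt.dayofweek
gap["hour"] = gap["t"].dt.hour

# Look up the profile values for every missing timestamp, then drop the helper keys
gap = gap.merge(profile.reset_index(), on=KEYS, how="left").drop(columns=KEYS)

filled = pd.concat([df1, gap, df2], ignore_index=True)
print(filled.iloc[250:260])
```

Deriving the gap boundaries from df1 and df2 rather than hard-coding dates avoids inserting rows for hours that already exist, such as 2019-03-01 00:00.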
