
How to fill a daily classification into its hourly intervals? (pandas, Python)

I am trying to fill an hourly DataFrame with a classification that is assigned per whole day. You can copy and paste the code below; it should run:

import pandas as pd
from datetime import timedelta, date

column2 = [1, 2, 3, 4, 7, 8, 9, 10]
column1 = [item for item in range(1, 74)]
column3 = pd.date_range('1998-01-01 00:00', freq='h', periods=73, tz ='Etc/GMT+0' )
column4 = ['1998-01-01 00:00:00', '1998-01-01 01:00:00', '1998-01-01 02:00:00', '1998-01-01 03:00:00',
          '1998-01-01 06:00:00', '1998-01-01 07:00:00', '1998-01-01 08:00:00', '1998-01-01 09:00:00']
column5 = ['1998-01-01', '1998-01-02', '1998-01-03']
column6 = ['Overcast', 'Clear', 'High']

dtst_1 = pd.DataFrame()
dtst_1['column1'] = column1
dtst_1.set_index(column3, inplace=True)

dtst_2 = pd.DataFrame()
dtst_2['column2'] = column2
dtst_2['column4'] = column4
dtst_2['column4'] = pd.to_datetime(dtst_2['column4'])
dtst_2.set_index('column4', inplace=True)

dtst_3 = pd.DataFrame()
dtst_3['column6'] = column6
dtst_3['column5'] = column5
dtst_3['column5'] = pd.to_datetime(dtst_3['column5'])
dtst_3.set_index('column5', inplace=True)


dtst_2.index = pd.to_datetime(dtst_2.index).tz_localize('Etc/GMT+0')
dtst_3.index = pd.to_datetime(dtst_3.index).tz_localize('Etc/GMT+0')
dtst_2 = dtst_2.merge(dtst_1['column1'], how = 'right', left_index=True, right_index=True)

def daterange_tst(start_date_tst, end_date_tst):
    for n in range(int ((end_date_tst - start_date_tst).days)):
        yield start_date_tst + timedelta(n)

start_date_tst = date(1998, 1, 1)
end_date_tst = date(1998, 1, 2)

for single_date_tst in daterange_tst(start_date_tst, end_date_tst):
    print(single_date_tst)
    dtst_2 = dtst_2.join(dtst_3['column6'], how = 'outer')

dtst_2.head(49)

And you should see this result:

[image: the resulting dataframe]

Is there any way to fill the NaN gaps in column6 with the day classification (day 1 filled with Overcast, day 2 with Clear, and so on)? This is just a small section of a huge dataset, so is there a way to propagate each day's classification to every hourly row of that day? Thank you so much.

Is this what you are trying to do?

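# Assign the result back; calling ffill(inplace=True) on a selected column is unreliable in recent pandas.
dtst_2['column6'] = dtst_2['column6'].ffill()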
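
Forward fill works here because each daily label sits on the midnight row of its day. If a day has no label, or the hourly data runs past the last classified day, the previous label is carried across. A minimal sketch of an alternative that looks each hour up by its calendar day, assuming one label per day indexed at midnight. The names hourly, daily and day_keys are illustrative and not from the question, and UTC stands in for 'Etc/GMT+0' to keep it short:

```python
import pandas as pd

# Hourly frame: 73 rows, i.e. three full days plus the first hour of day 4.
hourly = pd.DataFrame(
    {"column1": range(1, 74)},
    index=pd.date_range("1998-01-01 00:00", freq="h", periods=73, tz="UTC"),
)

# One classification per calendar day, indexed by that day's midnight.
daily = pd.Series(
    ["Overcast", "Clear", "High"],
    index=pd.date_range("1998-01-01", freq="D", periods=3, tz="UTC"),
    name="column6",
)

# Truncate every hourly timestamp to midnight, then look the day up in `daily`.
# Days that have no classification come back as NaN instead of a stale label.
day_keys = hourly.index.normalize()
hourly["column6"] = daily.reindex(day_keys).to_numpy()

print(hourly.iloc[[0, 23, 24, 48, 72]])
```

With this data, the single row on 1998-01-04 stays NaN, whereas a forward fill would label it High.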
