unstack date/hour dataframe into single column with datetime index - python, pandas

Question

I have a dataframe like:

                  0       1       2       3       4       5       6       7       8       9       10    11       12      13      14      15      16      17      18      19      20      21      22      23
    16.01.2018  25.45   24.99   24.68   25.00   26.19   28.96   35.78   44.66   41.75   41.58   41.48   41.66   40.66   40.39   40.33   40.73   41.58   45.06   45.84   42.69   39.56   35.4    33.27   29.49
    17.01.2018  28.78   27.71   26.55   25.76   25.97   26.97   30.89   36.06   41.24   40.67   39.86   39.42   38.17   37.31   36.58   36.78   37.8    40.78   40.8    38.95   34.34   31.95   31.56   29.26

where the index is the date a certain value has happened, while the column (from 0 to 23) indicates the hour. I would like to unstack the dataframe in order to have a datetime index and a single column with the respective value:

    16.01.2018 00:00:00  25.45
    16.01.2018 01:00:00  24.99
    16.01.2018 02:00:00  24.68
    16.01.2018 03:00:00  25.00
....

At the moment I am doing:

index = pd.date_range(start = df.index[0], periods=len(df.unstack()), freq='H')
new_df = pd.DataFrame(index=index)
for d in new_df.index.date:
    for h in new_df.index.hour:
        new_df['value'] = df.unstack()[h][d]

but the for loop is taking ages. Is there a better (faster) solution?

Answer 1

Convert the index to a DatetimeIndex and the columns to timedeltas. After reshaping with DataFrame.stack and Series.reset_index, summing the two new columns gives the datetime index:

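# Parse the day-first date strings explicitly to avoid day/month ambiguity
df.index = pd.to_datetime(df.index, format='%d.%m.%Y')
# Assumes the column labels are strings such as '0' ... '23'
df.columns = pd.to_timedelta(df.columns + ':00:00')
# One row per (date, hour) pair; the levels become columns level_0 and level_1
df = df.stack().reset_index(name='data')
# Date + hour offset gives the full timestamp
df.index = df.pop('level_0') + df.pop('level_1')
print(df.head())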
                      data
2018-01-16 00:00:00  25.45
2018-01-16 01:00:00  24.99
2018-01-16 02:00:00  24.68
2018-01-16 03:00:00  25.00
2018-01-16 04:00:00  26.19
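
If the hour columns are integers rather than strings, the string concatenation above would not apply. A minimal self-contained sketch of the same stack idea under that assumption follows; the helper name to_hourly_series and the three-hour sample frame are made up for illustration, and the values are the first three hours from the question.

```python
import pandas as pd

# Hypothetical sample: first three hours of the two days from the question.
df = pd.DataFrame(
    [[25.45, 24.99, 24.68], [28.78, 27.71, 26.55]],
    index=["16.01.2018", "17.01.2018"],
    columns=[0, 1, 2],  # assumed to be integer hour labels
)


def to_hourly_series(wide):
    """Reshape a date-by-hour frame into one Series with a DatetimeIndex."""
    out = wide.copy()
    # Explicit format avoids day/month ambiguity for dates like 01.02.2018.
    out.index = pd.to_datetime(out.index, format="%d.%m.%Y")
    # Integer hour labels convert directly to timedeltas.
    out.columns = pd.to_timedelta(out.columns.astype(int), unit="h")
    stacked = out.stack()  # MultiIndex: (date, hour offset)
    dates = stacked.index.get_level_values(0)
    hours = stacked.index.get_level_values(1)
    # Date + hour offset gives the full timestamp for each value.
    stacked.index = dates + hours
    return stacked.rename("data")


hourly = to_hourly_series(df)
print(hourly)
```

Working on a copy keeps the original wide frame intact, which may be convenient if it is still needed afterwards.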

The solution with unstack is similar; only the output ordering differs:

df.index = pd.to_datetime(df.index, format='%d.%m.%Y')
df.columns = pd.to_timedelta(df.columns + ':00:00')
# unstack puts the column level first: level_0 is the hour, level_1 the date
df = df.unstack().reset_index(name='data')
df.index = df.pop('level_1') + df.pop('level_0')
print(df.head())
                      data
2018-01-16 00:00:00  25.45
2018-01-17 00:00:00  28.78
2018-01-16 01:00:00  24.99
2018-01-17 01:00:00  27.71
2018-01-16 02:00:00  24.68
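
If the chronological order from the question is wanted with the unstack variant, sorting the combined index afterwards should be enough. A small sketch, assuming a hypothetical two-hour sample built from the question's values with the index and columns already converted:

```python
import pandas as pd

# Hypothetical sample: first two hours of the two days from the question.
df = pd.DataFrame(
    [[25.45, 24.99], [28.78, 27.71]],
    index=pd.to_datetime(["16.01.2018", "17.01.2018"], format="%d.%m.%Y"),
    columns=pd.to_timedelta(["0:00:00", "1:00:00"]),
)

# unstack() puts the column level first, so rows come out hour by hour.
long_df = df.unstack().reset_index(name="data")
long_df.index = long_df.pop("level_1") + long_df.pop("level_0")

# Sorting the combined DatetimeIndex restores chronological order.
long_df = long_df.sort_index()
print(long_df)
```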

solution1
1 ACCEPTED 2020-01-27 08:47:34