
Date_Range parameters

I want to apply date_range to the index of a DataFrame, which is a datetime, and add hours, days, or months to that index based on a duration value.

For example, the original DataFrame:

  Date_Out                Hour_Duration
2020-04-10 06:19:45            3
2020-04-19 20:05:50            6
2020-04-30 22:50:00            4

For example, the desired DataFrame:

 Date_Out                Hour_Duration
2020-04-10 06:19:45            3
2020-04-10 07:19:45            3
2020-04-10 08:19:45            3
2020-04-19 20:05:50            6
2020-04-19 21:05:50            6
2020-04-19 22:05:50            6
2020-04-19 23:05:50            6
2020-04-20 00:05:50            6
2020-04-20 01:05:50            6
2020-04-30 22:50:00            4
2020-04-30 23:50:00            4
2020-05-01 00:50:00            4
2020-05-01 01:50:00            4

What solution would you recommend? Is it possible to apply a function in the "periods" parameter of date_range?

Update:

Original DataFrame (DataFrame name: travels)

        Date    Actual Departure Date    Arrival Date     DurationHour  DHour
0   2020-04-28  2020-04-28 12:26:39 2020-04-28 16:24:00 0 days 03:57:21   3
1   2020-04-20  2020-04-20 07:53:22 2020-04-21 05:30:00 0 days 21:36:38   6
2   2020-05-28  2020-05-28 15:54:22 2020-05-29 08:17:00 0 days 16:22:38   2
3   2020-05-29  2020-05-29 22:57:05 2020-05-30 01:21:00 0 days 02:23:55   5
4   2020-05-25  2020-05-25 07:22:41 2020-05-30 13:47:00 5 days 06:24:19   1

travels.dtypes

Date                      datetime64[ns]
Actual Departure Date     datetime64[ns]
Arrival Date              datetime64[ns]
DurationHour              timedelta64[ns]
DHour                           int64

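Expected result

The result should appear in the Actual Departure Date column: each row is repeated according to the value in the DHour column, and the hour of Actual Departure Date increases by one with every repeat.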

     Date   Actual Departure Date    Arrival Date         DurationHour    DHour
0   2020-04-28  2020-04-28 12:26:39 2020-04-28 16:24:00 0 days 03:57:21     3
0   2020-04-28  2020-04-28 13:26:39 2020-04-28 16:24:00 0 days 03:57:21     3
0   2020-04-28  2020-04-28 14:26:39 2020-04-28 16:24:00 0 days 03:57:21     3
0   2020-04-28  2020-04-28 15:26:39 2020-04-28 16:24:00 0 days 03:57:21     3
1   2020-04-20  2020-04-20 07:53:22 2020-04-21 05:30:00 0 days 21:36:38     6
1   2020-04-20  2020-04-20 08:53:22 2020-04-21 05:30:00 0 days 21:36:38     6
1   2020-04-20  2020-04-20 09:53:22 2020-04-21 05:30:00 0 days 21:36:38     6
1   2020-04-20  2020-04-20 10:53:22 2020-04-21 05:30:00 0 days 21:36:38     6
1   2020-04-20  2020-04-20 11:53:22 2020-04-21 05:30:00 0 days 21:36:38     6
1   2020-04-20  2020-04-20 12:53:22 2020-04-21 05:30:00 0 days 21:36:38     6
1   2020-04-20  2020-04-20 13:53:22 2020-04-21 05:30:00 0 days 21:36:38     6
2   2020-05-28  2020-05-28 15:54:22 2020-05-29 08:17:00 0 days 16:22:38     2
2   2020-05-28  2020-05-28 16:54:22 2020-05-29 08:17:00 0 days 16:22:38     2
2   2020-05-28  2020-05-28 17:54:22 2020-05-29 08:17:00 0 days 16:22:38     2
3   2020-05-29  2020-05-29 23:57:05 2020-05-30 01:21:00 0 days 02:23:55     5
3   2020-05-29  2020-05-30 00:57:05 2020-05-30 01:21:00 0 days 02:23:55     5
3   2020-05-29  2020-05-30 01:57:05 2020-05-30 01:21:00 0 days 02:23:55     5
3   2020-05-29  2020-05-30 02:57:05 2020-05-30 01:21:00 0 days 02:23:55     5
3   2020-05-29  2020-05-30 03:57:05 2020-05-30 01:21:00 0 days 02:23:55     5
3   2020-05-29  2020-05-30 04:57:05 2020-05-30 01:21:00 0 days 02:23:55     5
4   2020-05-25  2020-05-25 07:22:41 2020-05-30 13:47:00 5 days 06:24:19     1
4   2020-05-25  2020-05-25 08:22:41 2020-05-30 13:47:00 5 days 06:24:19     1

I am trying the following: travels.loc[np.repeat(travels.index.values, abs(travels['DHour']))]. It repeats the rows correctly, but I do not get the required date and time sum in Actual Departure Date.

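You can do this with a list comprehension and pd.concat:

# Assumes Date_Out already has a datetime dtype
df = df.set_index('Date_Out')
pd.concat(
    [
        # One hourly range per row; rows missing from df get the row's duration
        df.reindex(
            pd.date_range(idx, periods=row["Hour_Duration"], freq="h"),
            fill_value=row["Hour_Duration"],
        )
        for idx, row in df.iterrows()
    ]
)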

Output:

                     Hour_Duration
2020-04-10 06:19:45              3
2020-04-10 07:19:45              3
2020-04-10 08:19:45              3
2020-04-19 20:05:50              6
2020-04-19 21:05:50              6
2020-04-19 22:05:50              6
2020-04-19 23:05:50              6
2020-04-20 00:05:50              6
2020-04-20 01:05:50              6
2020-04-30 22:50:00              4
2020-04-30 23:50:00              4
2020-05-01 00:50:00              4
2020-05-01 01:50:00              4
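
On the question of whether a function can be passed to "periods": as far as I know, date_range expects a single integer there, so the range has to be built once per row. The following is a minimal sketch of an alternative that does this and then flattens the result with explode. The variable names `hourly` and `out` are only illustrative, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "Date_Out": pd.to_datetime(
            ["2020-04-10 06:19:45", "2020-04-19 20:05:50", "2020-04-30 22:50:00"]
        ),
        "Hour_Duration": [3, 6, 4],
    }
)

# `periods` takes one integer per call, so build one hourly range per row
hourly = [
    list(pd.date_range(start, periods=n, freq="h"))
    for start, n in zip(df["Date_Out"], df["Hour_Duration"])
]

# Replace each timestamp by its list of timestamps, then emit one row per element
out = df.assign(Date_Out=hourly).explode("Date_Out")
out["Date_Out"] = pd.to_datetime(out["Date_Out"])  # explode leaves object dtype
out = out.set_index("Date_Out")
print(out)
```

Compared with reindexing inside pd.concat, this keeps Hour_Duration as an integer column and does not need fill_value.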

Update for the new data:

import pandas as pd
import numpy as np
from io import StringIO

input_text = StringIO("""        Date    Actual Departure Date    Arrival Date     DurationHour    DHour
0   2020-04-28  2020-04-28 12:26:39  2020-04-28 16:24:00  0 days 03:57:21   3
1   2020-04-20  2020-04-20 07:53:22  2020-04-21 05:30:00  0 days 21:36:38   6
2   2020-05-28  2020-05-28 15:54:22  2020-05-29 08:17:00  0 days 16:22:38   2
3   2020-05-29  2020-05-29 22:57:05  2020-05-30 01:21:00  0 days 02:23:55   5
4   2020-05-25  2020-05-25 07:22:41  2020-05-30 13:47:00  5 days 06:24:19   1""")

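# Columns are separated by two or more spaces; the raw string avoids an invalid-escape warning
df = pd.read_csv(input_text, sep=r'\s\s+', engine='python')

df['Date'] = pd.to_datetime(df['Date'])

# Build one hourly range per row from the Date index, then forward-fill the new rows
df = df.set_index('Date')
df_out = pd.concat(
    [
        df.reindex(
            pd.date_range(idx, periods=row["DHour"], freq="h"),
        )
        for idx, row in df.iterrows()
    ]
).ffill()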

Output:

                    Actual Departure Date         Arrival Date     DurationHour  DHour
2020-04-28 00:00:00   2020-04-28 12:26:39  2020-04-28 16:24:00  0 days 03:57:21    3.0
2020-04-28 01:00:00   2020-04-28 12:26:39  2020-04-28 16:24:00  0 days 03:57:21    3.0
2020-04-28 02:00:00   2020-04-28 12:26:39  2020-04-28 16:24:00  0 days 03:57:21    3.0
2020-04-20 00:00:00   2020-04-20 07:53:22  2020-04-21 05:30:00  0 days 21:36:38    6.0
2020-04-20 01:00:00   2020-04-20 07:53:22  2020-04-21 05:30:00  0 days 21:36:38    6.0
2020-04-20 02:00:00   2020-04-20 07:53:22  2020-04-21 05:30:00  0 days 21:36:38    6.0
2020-04-20 03:00:00   2020-04-20 07:53:22  2020-04-21 05:30:00  0 days 21:36:38    6.0
2020-04-20 04:00:00   2020-04-20 07:53:22  2020-04-21 05:30:00  0 days 21:36:38    6.0
2020-04-20 05:00:00   2020-04-20 07:53:22  2020-04-21 05:30:00  0 days 21:36:38    6.0
2020-05-28 00:00:00   2020-05-28 15:54:22  2020-05-29 08:17:00  0 days 16:22:38    2.0
2020-05-28 01:00:00   2020-05-28 15:54:22  2020-05-29 08:17:00  0 days 16:22:38    2.0
2020-05-29 00:00:00   2020-05-29 22:57:05  2020-05-30 01:21:00  0 days 02:23:55    5.0
2020-05-29 01:00:00   2020-05-29 22:57:05  2020-05-30 01:21:00  0 days 02:23:55    5.0
2020-05-29 02:00:00   2020-05-29 22:57:05  2020-05-30 01:21:00  0 days 02:23:55    5.0
2020-05-29 03:00:00   2020-05-29 22:57:05  2020-05-30 01:21:00  0 days 02:23:55    5.0
2020-05-29 04:00:00   2020-05-29 22:57:05  2020-05-30 01:21:00  0 days 02:23:55    5.0
2020-05-25 00:00:00   2020-05-25 07:22:41  2020-05-30 13:47:00  5 days 06:24:19    1.0
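
The output above steps the Date index and leaves Actual Departure Date unchanged. If the goal is to advance Actual Departure Date itself, one possible sketch builds on the np.repeat attempt from the question. It assumes each row should appear DHour times with offsets of 0, 1, 2, ... hours (the expected table in the question is not fully consistent on the row count, so this is a guess). Only two columns are reproduced for brevity, and `step` and `result` are illustrative names.

```python
import numpy as np
import pandas as pd

# Two columns of the `travels` frame from the question
travels = pd.DataFrame(
    {
        "Actual Departure Date": pd.to_datetime(
            [
                "2020-04-28 12:26:39",
                "2020-04-20 07:53:22",
                "2020-05-28 15:54:22",
                "2020-05-29 22:57:05",
                "2020-05-25 07:22:41",
            ]
        ),
        "DHour": [3, 6, 2, 5, 1],
    }
)

# Repeat every row DHour times; the original row labels are kept
result = travels.loc[
    np.repeat(travels.index.values, travels["DHour"].abs())
].copy()

# Running counter 0, 1, 2, ... within each original row = hours to add
step = result.groupby(level=0).cumcount()
result["Actual Departure Date"] = result["Actual Departure Date"] + pd.to_timedelta(
    step, unit="h"
)
print(result)
```

Because no reindexing or forward fill is involved, DHour stays an integer and the other columns are simply copied along with each repeated row.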
