
How to find and extract the last reading of recorded days/weeks/months from one column to new ones in a DataFrame using pandas?

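In a large DataFrame, I have readings in one column and, in another column, the local date and time of each reading in datetime format. I would like to generate new columns in the same DataFrame that contain only the last reading of each recorded day, week, or month, each in its own column. It is also important to be able to choose which day counts as the last day of the week. Here is a sample of my data (here I chose Wednesday as the last day of the week):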

       local_date_time    weekday  readings
0  2022-04-29 17:03:25     Friday       468
1  2022-04-29 23:42:06     Friday       638
2  2022-04-30 00:22:06   Saturday       649
3  2022-04-30 16:42:07   Saturday       650
4  2022-04-30 19:42:06   Saturday       641
5  2022-04-30 23:59:06   Saturday      1301
6  2022-05-01 00:42:07     Sunday      1240
7  2022-05-01 04:12:07     Sunday       927
8  2022-05-01 09:52:07     Sunday       810
9  2022-05-01 16:42:07     Sunday      1024
10 2022-05-01 23:52:07     Sunday       551
11 2022-05-02 09:02:07     Monday       534
12 2022-05-02 13:42:07     Monday       684
13 2022-05-02 22:32:08     Monday       952
14 2022-05-02 23:59:07     Monday       628
15 2022-05-03 00:02:07    Tuesday       640
16 2022-05-03 06:12:08    Tuesday       762
17 2022-05-03 11:22:08    Tuesday       707
18 2022-05-03 14:12:08    Tuesday       623
19 2022-05-03 21:02:08    Tuesday       713
20 2022-05-03 23:42:08    Tuesday       606
21 2022-05-04 01:02:09  Wednesday       565
22 2022-05-04 05:32:09  Wednesday       495
23 2022-05-04 20:22:09  Wednesday       565
24 2022-05-04 23:59:09  Wednesday       693
25 2022-05-05 00:02:09   Thursday       723
26 2022-05-05 04:12:08   Thursday       534
27 2022-05-05 10:22:09   Thursday       464
28 2022-05-05 15:42:09   Thursday       479
29 2022-05-05 23:59:09   Thursday       478

To do this, I tried to solve the problem with conditional df.loc assignments. Here is the code I wrote:

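# Split the timestamp into the parts used by the conditions below
df['Day'] = df['local_date_time'].dt.day
df['Hour'] = df['local_date_time'].dt.hour
df['Min'] = df['local_date_time'].dt.minute
# Copy the reading only when it was taken at 23:59
df.loc[(df['Hour'] == 23) & (df['Min'] >= 59), 'end_day'] = df['readings']
# Same, restricted to Wednesdays (the column is named 'weekday' in the sample data)
df.loc[(df['Hour'] == 23) & (df['Min'] >= 59) & (df['weekday'] == 'Wednesday'), 'end_week'] = df['readings']
# Same, restricted to the 30th of the month
df.loc[(df['Hour'] == 23) & (df['Min'] >= 59) & (df['Day'] == 30), 'end_month'] = df['readings']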

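The code works as long as there is a reading at 23:59 every day. However, when a day has no reading at 23:59, it does not pick up that day's last reading. In addition, with this approach I can only hard-code one specific date (for example, the 30th in this sample) as the last day of the month, which does not work for months with more or fewer days. Here is the result I would like to see: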

    local_date_time    weekday  readings  end_day  end_week  end_month
2022-04-29 17:03:25     Friday       468
2022-04-29 23:42:06     Friday       638      638
2022-04-30 00:22:06   Saturday       649
2022-04-30 16:42:07   Saturday       650
2022-04-30 19:42:06   Saturday       641
2022-04-30 23:59:06   Saturday      1301     1301                 1301
2022-05-01 00:42:07     Sunday      1240
2022-05-01 04:12:07     Sunday       927
2022-05-01 09:52:07     Sunday       810
2022-05-01 16:42:07     Sunday      1024
2022-05-01 23:52:07     Sunday       551      551
2022-05-02 09:02:07     Monday       534
2022-05-02 13:42:07     Monday       684
2022-05-02 22:32:08     Monday       952
2022-05-02 23:59:07     Monday       628      628
2022-05-03 00:02:07    Tuesday       640
2022-05-03 06:12:08    Tuesday       762
2022-05-03 11:22:08    Tuesday       707
2022-05-03 14:12:08    Tuesday       623
2022-05-03 21:02:08    Tuesday       713
2022-05-03 23:42:08    Tuesday       606      606
2022-05-04 01:02:09  Wednesday       565
2022-05-04 05:32:09  Wednesday       495
2022-05-04 20:22:09  Wednesday       565
2022-05-04 23:59:09  Wednesday       693      693       693
2022-05-05 00:02:09   Thursday       723
2022-05-05 04:12:08   Thursday       534
2022-05-05 10:22:09   Thursday       464
2022-05-05 15:42:09   Thursday       479
2022-05-05 23:59:09   Thursday       478      478

Resampling with a proper DatetimeIndex would be useful here.

import numpy as np
import pandas as pd

# Make local_date_time a DatetimeIndex (resample needs one):
df['local_date_time'] = pd.to_datetime(df['local_date_time'])
df = df.set_index('local_date_time')

# Do the resampling, keeping the last reading of each bin on its own row:
# End of day:
df['end_day'] = df.resample('D')['readings'].transform(lambda x: x.tail(1))
# Weekly, with weeks ending on Wednesday:
df['end_week'] = df.resample('W-Wed')['readings'].transform(lambda x: x.tail(1))
# End of month (newer pandas releases deprecate 'M' here in favor of 'ME'):
df['end_month'] = df.resample('M')['readings'].transform(lambda x: x.tail(1))

# Some corrections for the very end:
# Clear values that do not fall on a Wednesday (weekday 2):
df['end_week'] = df['end_week'].where(df.index.to_series().dt.weekday.eq(2), np.nan)
# Clear values that do not fall on the last day of a month:
df['end_month'] = df['end_month'].where(df.index.to_series().dt.is_month_end, np.nan)

Output:

                       weekday  readings  end_day  end_week  end_month
local_date_time
2022-04-29 17:03:25     Friday       468      NaN       NaN        NaN
2022-04-29 23:42:06     Friday       638    638.0       NaN        NaN
2022-04-30 00:22:06   Saturday       649      NaN       NaN        NaN
2022-04-30 16:42:07   Saturday       650      NaN       NaN        NaN
2022-04-30 19:42:06   Saturday       641      NaN       NaN        NaN
2022-04-30 23:59:06   Saturday      1301   1301.0       NaN     1301.0
2022-05-01 00:42:07     Sunday      1240      NaN       NaN        NaN
2022-05-01 04:12:07     Sunday       927      NaN       NaN        NaN
2022-05-01 09:52:07     Sunday       810      NaN       NaN        NaN
2022-05-01 16:42:07     Sunday      1024      NaN       NaN        NaN
2022-05-01 23:52:07     Sunday       551    551.0       NaN        NaN
2022-05-02 09:02:07     Monday       534      NaN       NaN        NaN
2022-05-02 13:42:07     Monday       684      NaN       NaN        NaN
2022-05-02 22:32:08     Monday       952      NaN       NaN        NaN
2022-05-02 23:59:07     Monday       628    628.0       NaN        NaN
2022-05-03 00:02:07    Tuesday       640      NaN       NaN        NaN
2022-05-03 06:12:08    Tuesday       762      NaN       NaN        NaN
2022-05-03 11:22:08    Tuesday       707      NaN       NaN        NaN
2022-05-03 14:12:08    Tuesday       623      NaN       NaN        NaN
2022-05-03 21:02:08    Tuesday       713      NaN       NaN        NaN
2022-05-03 23:42:08    Tuesday       606    606.0       NaN        NaN
2022-05-04 01:02:09  Wednesday       565      NaN       NaN        NaN
2022-05-04 05:32:09  Wednesday       495      NaN       NaN        NaN
2022-05-04 20:22:09  Wednesday       565      NaN       NaN        NaN
2022-05-04 23:59:09  Wednesday       693    693.0     693.0        NaN
2022-05-05 00:02:09   Thursday       723      NaN       NaN        NaN
2022-05-05 04:12:08   Thursday       534      NaN       NaN        NaN
2022-05-05 10:22:09   Thursday       464      NaN       NaN        NaN
2022-05-05 15:42:09   Thursday       479      NaN       NaN        NaN
2022-05-05 23:59:09   Thursday       478    478.0       NaN        NaN
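
As a possible alternative, here is a minimal sketch that produces the same three columns without setting a DatetimeIndex and without relying on how transform aligns a function result that is shorter than its group, which may differ between pandas versions. It labels each row with its day, week, or month period via dt.to_period and marks the last row of each period. The function name add_period_end_columns and the parameters ts_col, value_col, week_freq and strict are illustrative names, not part of pandas or of the code above, and the sketch assumes timezone-naive timestamps as in the sample data.

```python
import pandas as pd


def add_period_end_columns(df, ts_col="local_date_time", value_col="readings",
                           week_freq="W-WED", strict=True):
    """Hypothetical helper: add end_day / end_week / end_month columns.

    week_freq is an anchored weekly alias; "W-WED" means weeks end on Wednesday.
    With strict=True a reading is kept only if it falls on the final calendar
    day of its period (the Wednesday, or the last day of the month).
    """
    # Sort by time so "last row of the period" means "latest reading".
    out = df.sort_values(ts_col, kind="stable").copy()
    ts = out[ts_col]
    for name, freq in (("end_day", "D"), ("end_week", week_freq), ("end_month", "M")):
        # Label every row with the period (day, week, month) it belongs to.
        period = ts.dt.to_period(freq)
        # True only on the last row carrying each period label.
        is_last = ~period.duplicated(keep="last")
        if strict:
            # Drop periods whose latest reading is not on the period's final day.
            on_final_day = ts.dt.normalize() == period.dt.end_time.dt.normalize()
            is_last = is_last & on_final_day
        # Copy the reading where the mask holds, NaN elsewhere.
        out[name] = out[value_col].where(is_last)
    return out


# Small demo frame taken from the sample data in the question.
demo = pd.DataFrame({
    "local_date_time": pd.to_datetime([
        "2022-04-29 23:42:06", "2022-04-30 16:42:07", "2022-04-30 23:59:06",
        "2022-05-03 23:42:08", "2022-05-04 05:32:09", "2022-05-04 23:59:09",
        "2022-05-05 23:59:09",
    ]),
    "readings": [638, 650, 1301, 606, 495, 693, 478],
})
print(add_period_end_columns(demo))
```

With strict=True this mirrors the two "corrections" in the code above: a week or month whose latest reading is not on its final day gets no value at all. Passing strict=False keeps the latest reading of every week and month instead, including a trailing, incomplete period, which may be preferable when readings can be missing on the boundary day.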
