Extracting values from a df
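Problem statement:
Each vehicle has multiple charging and discharging instances. For a given date, get the minimum charge, maximum charge, minimum discharge, and maximum discharge for each vehicle.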
df1
Date Time vehicle_no soc SOC Diff
0 2022-10-01 02:27:56 DL21GD0100 80.0 0
1 2022-10-01 02:28:26 DL21GD0100 80.0 Discharging
2 2022-10-01 02:28:56 DL21GD0100 80.0 Discharging
3 2022-10-01 02:29:26 DL21GD0100 80.0 Discharging
4 2022-10-01 02:29:56 DL21GD0100 69.0 Discharging
5 2022-10-01 02:29:56 DL21GD0100 70.0 Charging
6 2022-10-01 02:29:56 DL21GD0100 71.0 Charging
7 2022-10-01 02:29:56 DL21GD0100 72.0 Charging
8 2022-10-01 03:16:00 DL21GD0100 63.0 Discharging
9 2022-10-01 03:16:30 DL21GD0100 23.0 Discharging
10 2022-10-01 04:17:00 DL21GD0100 54.0 Charging
11 2022-10-01 09:17:30 WB25M9298 24.0 Charging
12 2022-10-01 09:18:00 WB25M9298 25.0 Charging
You can use groupby.diff to get the difference within each group, then numpy.sign and map:
df['status'] = np.sign(df.groupby('vehicle_no')['soc'].diff()
).map({1: 'Charging', -1: 'Discharging'})
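As a minimal sketch of how the diff/sign/map chain behaves, the snippet below runs it on a small hypothetical frame (the vehicle names "A" and "B" and the soc values are illustrative assumptions, not the question's data). The first row of each vehicle and any unchanged reading get no label.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({
    "vehicle_no": ["A", "A", "A", "A", "B", "B", "B"],
    "soc": [80.0, 80.0, 69.0, 70.0, 23.0, 24.0, 24.0],
})

# Difference to the previous row within each vehicle;
# the first row of every group has no predecessor, so it is NaN
delta = df.groupby("vehicle_no")["soc"].diff()

# np.sign gives 1.0, -1.0, 0.0 or NaN; only 1 and -1 are in the
# mapping, so unchanged (0) and first rows end up as missing values
df["status"] = np.sign(delta).map({1: "Charging", -1: "Discharging"})

print(df)
```

Rows with an unchanged soc stay unlabelled here, which is what the variants further down handle differently.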
Or use numpy.select:
s = df.groupby('vehicle_no')['soc'].diff()
df['status'] = np.select([s>0, s<0], ['Charging', 'Discharging'], np.nan)
Output:
Date Time vehicle_no soc status
0 2022-10-01 02:27:56 DL21GD0100 80.0 NaN
2 2022-10-01 02:28:56 DL21GD0100 80.0 NaN
3 2022-10-01 02:29:26 DL21GD0100 80.0 NaN
4 2022-10-01 02:29:56 DL21GD0100 69.0 Discharging
5 2022-10-01 02:29:56 DL21GD0100 70.0 Charging
6 2022-10-01 02:29:56 DL21GD0100 71.0 Charging
7 2022-10-01 02:29:56 DL21GD0100 72.0 Charging
8 2022-10-01 09:16:00 WB25M9298 23.0 NaN
9 2022-10-01 09:16:30 WB25M9298 23.0 NaN
10 2022-10-01 09:17:00 WB25M9298 24.0 Charging
11 2022-10-01 09:17:30 WB25M9298 24.0 NaN
12 2022-10-01 09:18:00 WB25M9298 25.0 Charging
If you want to treat equal values as discharging:
df['status'] = np.where(df.groupby('vehicle_no')['soc'].diff().gt(0), 'Charging', 'Discharging')
Output:
Date Time vehicle_no soc status
0 2022-10-01 02:27:56 DL21GD0100 80.0 Discharging
2 2022-10-01 02:28:56 DL21GD0100 80.0 Discharging
3 2022-10-01 02:29:26 DL21GD0100 80.0 Discharging
4 2022-10-01 02:29:56 DL21GD0100 69.0 Discharging
5 2022-10-01 02:29:56 DL21GD0100 70.0 Charging
6 2022-10-01 02:29:56 DL21GD0100 71.0 Charging
7 2022-10-01 02:29:56 DL21GD0100 72.0 Charging
8 2022-10-01 09:16:00 WB25M9298 23.0 Discharging
9 2022-10-01 09:16:30 WB25M9298 23.0 Discharging
10 2022-10-01 09:17:00 WB25M9298 24.0 Charging
11 2022-10-01 09:17:30 WB25M9298 24.0 Discharging
12 2022-10-01 09:18:00 WB25M9298 25.0 Charging
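If you want equal values to keep the previous status instead (forward-filled within each vehicle, with the leading rows defaulting to 'Discharging'):
d = {1: 'Charging', -1: 'Discharging'}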
df['status'] = (df.groupby('vehicle_no')['soc']
.transform(lambda s: np.sign(s.diff()).map(d).ffill())
.fillna('Discharging')
)
Output:
Date Time vehicle_no soc status
0 2022-10-01 02:27:56 DL21GD0100 80.0 Discharging
2 2022-10-01 02:28:56 DL21GD0100 80.0 Discharging
3 2022-10-01 02:29:26 DL21GD0100 80.0 Discharging
4 2022-10-01 02:29:56 DL21GD0100 69.0 Discharging
5 2022-10-01 02:29:56 DL21GD0100 70.0 Charging
6 2022-10-01 02:29:56 DL21GD0100 71.0 Charging
7 2022-10-01 02:29:56 DL21GD0100 72.0 Charging
8 2022-10-01 09:16:00 WB25M9298 23.0 Discharging
9 2022-10-01 09:16:30 WB25M9298 23.0 Discharging
10 2022-10-01 09:17:00 WB25M9298 24.0 Charging
11 2022-10-01 09:17:30 WB25M9298 24.0 Charging
12 2022-10-01 09:18:00 WB25M9298 25.0 Charging
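The forward-fill variant can also be written without a lambda by labelling first and then calling ffill per group. This is a sketch of an alternative, not the answer's exact code, and the sample frame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({
    "vehicle_no": ["A", "A", "A", "A", "A", "B", "B", "B"],
    "soc": [80.0, 80.0, 69.0, 70.0, 70.0, 23.0, 24.0, 24.0],
})
labels = {1: "Charging", -1: "Discharging"}

# Label rising and falling readings; unchanged and first rows are missing
raw = np.sign(df.groupby("vehicle_no")["soc"].diff()).map(labels)

# Carry the last known label forward, but never across vehicles,
# then default whatever is still missing at the start of a group
df["status"] = (
    raw.groupby(df["vehicle_no"]).ffill()
       .fillna("Discharging")
)

print(df)
```

Grouping the label series by vehicle_no before the ffill keeps one vehicle's last status from leaking into the first rows of the next.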
Try this:
df1.groupby(['vehicle_no', 'status']).agg({'soc': ['min', 'max']})
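Since the question asks for the values per date, Date can be added to the grouping keys. The sketch below assumes the label column is called status (the question's frame names it "SOC Diff") and uses a handful of rows copied from the sample df1; unstacking the status level is an optional step that gives one row per date and vehicle.

```python
import pandas as pd

# A few rows from the question's sample, with the label column
# assumed to be named "status"
df1 = pd.DataFrame({
    "Date": ["2022-10-01"] * 6,
    "vehicle_no": ["DL21GD0100"] * 4 + ["WB25M9298"] * 2,
    "soc": [69.0, 70.0, 72.0, 63.0, 24.0, 25.0],
    "status": ["Discharging", "Charging", "Charging",
               "Discharging", "Charging", "Charging"],
})

# Min and max soc for every (date, vehicle, status) combination,
# then pivot status into the columns: one row per date and vehicle
summary = (
    df1.groupby(["Date", "vehicle_no", "status"])["soc"]
       .agg(["min", "max"])
       .unstack("status")
)

print(summary)
```

A vehicle with no rows for one of the statuses on a given date shows up with missing values in the corresponding columns.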