Replace values of a Pandas dataframe's Column based on values of another column
How to replace the values in a column in pandas dataframe based on check from another table
I have a dataframe df:
import pandas as pd
df = pd.DataFrame({"Cust": ['cst1', 'cst1', 'cst1', 'cst2', 'cst2', 'cst2', 'cst3', 'cst3', 'cst3', 'cst4', 'cst4', 'cst4'],
                   "act": ['ac1', 'ac2', 'ac3', 'ac1', 'ac2', 'ac3', 'ac1', 'ac2', 'ac3', 'ac1', 'ac2', 'ac3'],
                   "rating": ['a', 'b', 'c', 'b', 'b', 'c', 'h', 'i', 'i', 'c', 'c', 'a']})
and another dataframe, df_priority, that holds the priority of each rating:
df_priority = pd.DataFrame({"rating": ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's'],
                            "priority": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]})
My df table looks like:
    Cust  act rating
0   cst1  ac1      a
1   cst1  ac2      b
2   cst1  ac3      c
3   cst2  ac1      b
4   cst2  ac2      b
5   cst2  ac3      c
6   cst3  ac1      h
7   cst3  ac2      i
8   cst3  ac3      i
9   cst4  ac1      c
10  cst4  ac2      c
11  cst4  ac3      a
My df_priority table looks like:
   rating  priority
0       a         1
1       b         2
2       c         3
3       d         4
4       e         5
5       f         6
6       g         7
7       h         8
8       i         9
9       j        10
10      k        11
11      l        12
12      m        13
13      n        14
14      o        15
15      p        16
16      q        17
17      r        18
18      s        19
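I need to check the rating values of each cust in the df table and replace them with the highest-priority rating for that cust.
For example, for cust = cst1, all three records should get rating a, because a has a higher priority than b and c. The same should be done for every customer: check the priority table and update the ratings accordingly.
My expected output is:
    Cust  act rating
0   cst1  ac1      a
1   cst1  ac2      a
2   cst1  ac3      a
3   cst2  ac1      b
4   cst2  ac2      b
5   cst2  ac3      b
6   cst3  ac1      h
7   cst3  ac2      h
8   cst3  ac3      h
9   cst4  ac1      a
10  cst4  ac2      a
11  cst4  ac3      a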
How can I do this in pandas?
We can do this with transform and idxmin after a map, then use reindex to assign the result back:
df['new']=df.rating.map(dict(zip(df_priority.rating,df_priority.priority)))
df.groupby('Cust').new.transform('idxmin')
0      0
1      0
2      0
3      3
4      3
5      3
6      6
7      6
8      6
9     11
10    11
11    11
Name: new, dtype: int64
df['newcol'] = df.rating.reindex(df.groupby('Cust').new.transform('idxmin')).tolist()
df
    Cust  act rating  new newcol
0   cst1  ac1      a    1      a
1   cst1  ac2      b    2      a
2   cst1  ac3      c    3      a
3   cst2  ac1      b    2      b
4   cst2  ac2      b    2      b
5   cst2  ac3      c    3      b
6   cst3  ac1      h    8      h
7   cst3  ac2      i    9      h
8   cst3  ac3      i    9      h
9   cst4  ac1      c    3      a
10  cst4  ac2      c    3      a
11  cst4  ac3      a    1      a
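Since idxmin returns index labels, the approach above relies on df having a unique index. As a rough alternative sketch (not part of the original answer), the same result can be computed purely on values: take the minimum priority per customer and map it back to a rating. This assumes each priority number belongs to exactly one rating; the names to_priority, to_rating and best are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Cust": ["cst1", "cst1", "cst1", "cst2", "cst2", "cst2",
             "cst3", "cst3", "cst3", "cst4", "cst4", "cst4"],
    "act": ["ac1", "ac2", "ac3"] * 4,
    "rating": ["a", "b", "c", "b", "b", "c", "h", "i", "i", "c", "c", "a"],
})
df_priority = pd.DataFrame({
    "rating": list("abcdefghijklmnopqrs"),
    "priority": range(1, 20),
})

# rating -> priority lookup, and the inverse priority -> rating lookup
# (the inverse assumes priorities are unique in df_priority)
to_priority = df_priority.set_index("rating")["priority"]
to_rating = df_priority.set_index("priority")["rating"]

# Smallest priority number per customer, broadcast to every row of that customer
best = df["rating"].map(to_priority).groupby(df["Cust"]).transform("min")

# Translate the winning priority back to its rating label
df["rating"] = best.map(to_rating)
```

Because nothing here looks rows up by label, this variant should not care what the index of df looks like.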
Let's map each rating to its priority, then use groupby with idxmin to find the highest-priority row for each customer, and finally assign back:
idx = (df['rating'].map(df_priority.set_index('rating')['priority'])
       .groupby(df['Cust']).transform('idxmin')
      )
df['rating'] = df.loc[idx, 'rating'].values
Output:
    Cust  act rating
0   cst1  ac1      a
1   cst1  ac2      a
2   cst1  ac3      a
3   cst2  ac1      b
4   cst2  ac2      b
5   cst2  ac3      b
6   cst3  ac1      h
7   cst3  ac2      h
8   cst3  ac3      h
9   cst4  ac1      a
10  cst4  ac2      a
11  cst4  ac3      a
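If this is needed in more than one place, the idxmin approach could be wrapped in a small helper. The sketch below is only one possible packaging: the function name replace_with_top_rating is hypothetical, it assumes df has a unique index, and it adds a check (not in the answer above) that every rating actually appears in the priority table.

```python
import pandas as pd


def replace_with_top_rating(df, df_priority):
    """Return a copy of df in which each customer's ratings are replaced by
    that customer's highest-priority rating (lowest priority number)."""
    lookup = df_priority.set_index("rating")["priority"]
    prio = df["rating"].map(lookup)

    # Ratings absent from the priority table map to NaN; fail early on those
    missing = df.loc[prio.isna(), "rating"].unique()
    if len(missing):
        raise ValueError(f"ratings without a priority: {sorted(missing)}")

    # Index label of the best-rated row per customer, repeated for each row
    idx = prio.groupby(df["Cust"]).transform("idxmin")

    out = df.copy()
    out["rating"] = df.loc[idx, "rating"].to_numpy()
    return out


# Small usage example with a non-default (but unique) index
df = pd.DataFrame({"Cust": ["x", "x", "y"], "rating": ["c", "a", "b"]},
                  index=[10, 20, 30])
df_priority = pd.DataFrame({"rating": ["a", "b", "c"], "priority": [1, 2, 3]})
result = replace_with_top_rating(df, df_priority)
```

Returning a copy leaves the original ratings available for comparison, which differs from the in-place assignment used in the answer.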