
Pandas - compare 2 columns and choose value based on Priority

Below is my input dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Level_DB': ['Level 1 Experienced', 'Level 2 Expert', 'Level 1 Experienced', 'Level 2 Expert', 'Level 3 Thought Leader', 'Level 1 Experienced', 'Non-Certified', 'Level 3 Thought Leader', 'Certified', 'Certified', np.nan, 'Level 1 Experienced'],
                    'Level_Legacy' :[ 'Certified', 'Level 1 Experienced', 'Level 3 Thought Leader', 'Level 3 Thought Leader Recert', 'Level 3 Thought Leader Recert', 'Non-Certified', 'non-certified', 'Level 2 Expert Recert', 'Level 1 Experienced', 'Non-Certified', 'Certified', '']})

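The target column "Output" should be generated by comparing the input columns "Level_DB" and "Level_Legacy" and choosing the value with the highest priority. The priority list, from highest to lowest, is as follows: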

priority_List = ['Level 3 Thought Leader', 'Level 3 Thought Leader New', 'Level 3 Thought Leader Recert', 'Level 3 Thought Leader Recert Lapsed',
                 'Level 2 Expert', 'Level 2 Expert New', 'Level 2 Expert Recert', 'Level 2 Expert Recert Lapsed',
                 'Level 1 Experienced', 'Level 1 Experienced New', 'Level 1 Experienced Recert', 'Level 1 Experienced Recert Lapsed', 'Certified', 'Non-Certified' , 'non-certified']
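
To make the rule concrete for a single pair of values, here is a minimal sketch in plain Python. It assumes that an earlier position in the list means a higher priority, and that values missing from the list (NaN or an empty string) should be ignored. The shortened list `priority` and the helper `higher_priority` are hypothetical names used only for illustration.

```python
# Hypothetical, shortened priority list: earlier entries win.
priority = ['Level 2 Expert', 'Level 1 Experienced', 'Certified', 'Non-Certified']

# Map each label to its position so that a lower rank means a higher priority.
rank = {name: i for i, name in enumerate(priority)}

def higher_priority(a, b):
    """Return whichever of a and b comes first in `priority`, or None if neither is listed."""
    candidates = [v for v in (a, b) if v in rank]
    return min(candidates, key=rank.get) if candidates else None

print(higher_priority('Certified', 'Level 1 Experienced'))
print(higher_priority(float('nan'), 'Certified'))
```

The desired "Output" column is this comparison applied to every row of the two input columns.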

The expected final DataFrame, with the required "Output" column, was attached as an image (not reproduced here).

I can't think of where to start. Please help.

The idea is to create an ordered categorical and reshape with DataFrame.stack, so that the output is the max per first index level (level=0):

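from pandas.api.types import CategoricalDtype

# reverse the list so that the highest priority becomes the largest category
cat_type = CategoricalDtype(categories=priority_List[::-1], ordered=True)

# Series.max(level=0) was removed in pandas 2.0; groupby(level=0).max() is the equivalent

#solution if more columns in data
#df['Output'] = df[['Level_DB','Level_Legacy']].stack().astype(cat_type).groupby(level=0).max()
df['Output'] = df.stack().astype(cat_type).groupby(level=0).max()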
print (df)
                  Level_DB                   Level_Legacy  \
0      Level 1 Experienced                      Certified   
1           Level 2 Expert            Level 1 Experienced   
2      Level 1 Experienced         Level 3 Thought Leader   
3           Level 2 Expert  Level 3 Thought Leader Recert   
4   Level 3 Thought Leader  Level 3 Thought Leader Recert   
5      Level 1 Experienced                  Non-Certified   
6            Non-Certified                  non-certified   
7   Level 3 Thought Leader          Level 2 Expert Recert   
8                Certified            Level 1 Experienced   
9                Certified                  Non-Certified   
10                     NaN                      Certified   
11     Level 1 Experienced                                  

                           Output  
0             Level 1 Experienced  
1                  Level 2 Expert  
2          Level 3 Thought Leader  
3   Level 3 Thought Leader Recert  
4          Level 3 Thought Leader  
5             Level 1 Experienced  
6                   Non-Certified  
7          Level 3 Thought Leader  
8             Level 1 Experienced  
9                       Certified  
10                      Certified  
11            Level 1 Experienced  
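
As a small illustration of why the ordered categorical matters, the sketch below uses a shortened, hypothetical priority list (`priority`) and a toy Series (`raw`); neither comes from the question. It assumes that values outside the list should count as missing, and it masks them explicitly with `Series.where` before casting instead of relying on the cast to do so.

```python
import numpy as np
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical, shortened priority list (highest priority first).
priority = ['Level 1 Experienced', 'Certified', 'Non-Certified']

# Reverse it so that the highest priority is the largest category.
cat_type = CategoricalDtype(categories=priority[::-1], ordered=True)

raw = pd.Series(['Non-Certified', 'Certified', '', np.nan])

# Turn anything that is not in the list ('' here) into NaN, then cast.
s = raw.where(raw.isin(priority)).astype(cat_type)

print(s.isna().tolist())
# max() follows the category order and skips NaN, so 'Certified' wins
# even though 'Non-Certified' sorts later alphabetically.
print(s.max())
```

A plain string comparison would have picked 'Non-Certified' here, which is why the category order has to carry the priority.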

We can use Series.map here by enumerating your priority_List and taking the lowest index per row, which corresponds to the highest priority:

dct_priority = {j:i for i, j in enumerate(priority_List)}
dct_priority_reverse = {i:j for i, j in enumerate(priority_List)}

df['Output'] = df.apply(lambda x: x.map(dct_priority)).min(axis=1).map(dct_priority_reverse)
                  Level_DB                   Level_Legacy                         Output
0      Level 1 Experienced                      Certified            Level 1 Experienced
1           Level 2 Expert            Level 1 Experienced                 Level 2 Expert
2      Level 1 Experienced         Level 3 Thought Leader         Level 3 Thought Leader
3           Level 2 Expert  Level 3 Thought Leader Recert  Level 3 Thought Leader Recert
4   Level 3 Thought Leader  Level 3 Thought Leader Recert         Level 3 Thought Leader
5      Level 1 Experienced                  Non-Certified            Level 1 Experienced
6            Non-Certified                  non-certified                  Non-Certified
7   Level 3 Thought Leader          Level 2 Expert Recert         Level 3 Thought Leader
8                Certified            Level 1 Experienced            Level 1 Experienced
9                Certified                  Non-Certified                      Certified
10                     NaN                      Certified                      Certified
11     Level 1 Experienced                                           Level 1 Experienced
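
The same mapping idea can be wrapped in a small helper that only looks at the columns you name, so it can be re-run after "Output" already exists. This is a sketch under assumptions: `pick_by_priority`, `demo`, and the shortened `priority` list are hypothetical names, and rows where no value is in the list are assumed to stay NaN.

```python
import numpy as np
import pandas as pd

def pick_by_priority(frame, columns, priority):
    """Per row, return the value from `columns` that appears earliest in `priority`."""
    rank = {value: i for i, value in enumerate(priority)}
    # Replace each label by its rank; unlisted values and NaN become NaN.
    ranks = frame[columns].apply(lambda col: col.map(rank))
    # The lowest rank in a row is the highest priority (NaN if nothing is listed).
    best = ranks.min(axis=1)
    # Translate the winning rank back to its label, leaving NaN untouched.
    return best.map(lambda r: priority[int(r)], na_action='ignore')

# Hypothetical, shortened data for illustration only.
priority = ['Level 2 Expert', 'Level 1 Experienced', 'Certified']
demo = pd.DataFrame({
    'Level_DB': ['Certified', np.nan, 'Level 2 Expert', np.nan],
    'Level_Legacy': ['Level 1 Experienced', 'Certified', '', 'unknown'],
})
demo['Output'] = pick_by_priority(demo, ['Level_DB', 'Level_Legacy'], priority)
print(demo)
```

Passing the column names explicitly avoids feeding a previously computed "Output" column back into the comparison.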

