Using a regular expression to find a word in one pandas DataFrame column and set a new value in a different column

Question

Suppose I have a DataFrame with the following contents:

import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Peter', 'Sue'],
                   'Job': ['Dentist', 'Blogger', 'Cook', 'Cook'],
                   'Sector': ['Health', 'Entertainment', '', '']})

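I want to find every "cook" in the 'Job' column, regardless of capitalization, and assign the value 'gastronomy' to the 'Sector' column for those rows. How can I do this without overwriting the other entries in the 'Sector' column? Thanks!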

Answer 1

Here is one way:

df.loc[df.Job.str.lower().eq('cook'), 'Sector'] = 'gastronomy'

print(df)

    Name      Job         Sector
0   John  Dentist         Health
1  Alice  Blogger  Entertainment
2  Peter     Cook     gastronomy
3    Sue     Cook     gastronomy
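The one-liner above can be split into two steps to make it clearer why the other 'Sector' entries survive. This is only a sketch: the variable name `is_cook` is made up for illustration, and one 'Job' value has been changed to lowercase 'cook' to show the case-insensitive comparison.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Peter', 'Sue'],
                   'Job': ['Dentist', 'Blogger', 'Cook', 'cook'],
                   'Sector': ['Health', 'Entertainment', '', '']})

# Boolean mask: True where Job equals 'cook' after lowercasing
# (is_cook is an illustrative name, not part of the original answer)
is_cook = df['Job'].str.lower().eq('cook')

# .loc assigns only on the masked rows; every other Sector value is untouched
df.loc[is_cook, 'Sector'] = 'gastronomy'

print(df)
```

Because the assignment goes through the mask, 'Health' and 'Entertainment' are never written to. Note that this comparison is an exact match, so a value such as 'Line cook' would not be selected.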

Answer 2

Use Series.str.match with a regex and the inline case-insensitive flag (?i):

df.loc[df['Job'].str.match('(?i)cook'), 'Sector'] = 'gastronomy'

Output:

    Name      Job         Sector
0   John  Dentist         Health
1  Alice  Blogger  Entertainment
2  Peter     Cook     gastronomy
3    Sue     Cook     gastronomy
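One thing worth knowing about Series.str.match is that it anchors only at the start of each string, so it also accepts values that merely begin with "cook". The sketch below compares it with Series.str.fullmatch (assumed available, pandas 1.1 or later) and Series.str.contains. The sample job titles are invented for illustration, and `re.IGNORECASE` is used instead of the inline (?i) flag; the two should behave the same here.

```python
import re
import pandas as pd

# Hypothetical job titles, including a missing value
jobs = pd.Series(['Cook', 'COOK', 'Cookbook author', 'Line cook', None])

# match: pattern must match at the start of the string
starts_with = jobs.str.match('cook', flags=re.IGNORECASE, na=False)

# fullmatch: the entire string must match the pattern
whole = jobs.str.fullmatch('cook', flags=re.IGNORECASE, na=False)

# contains: pattern may appear anywhere in the string
anywhere = jobs.str.contains('cook', case=False, na=False)

print(pd.DataFrame({'job': jobs, 'match': starts_with,
                    'fullmatch': whole, 'contains': anywhere}))
```

Passing `na=False` makes missing values count as non-matches, which keeps the mask purely boolean so it can be used directly in `df.loc[...]`.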
