
Filter pandas dataframe column and replace values using a list condition

I have the following dataframe:

ID  Type                Job
1   Employee            Doctor
2   Contingent Worker   Doctor
3   Employee            Employee
4   Employee            Employee
5   Contingent Worker   Employee
6   Contingent Worker   Consultant
7   Contingent Worker   Trainee
8   Contingent Worker   SSS
9   Contingent Worker   Agency Worker
10  Contingent Worker   
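
For anyone who wants to reproduce the problem, the table above could be built roughly as follows. This is only a sketch: it assumes the blank Job in row 10 is an empty string (it could just as well be NaN in the real data), and the variable name df is chosen to match the answers.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: the blank Job for ID 10 is an empty string.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Type": ["Employee", "Contingent Worker", "Employee", "Employee"]
            + ["Contingent Worker"] * 6,
    "Job": ["Doctor", "Doctor", "Employee", "Employee", "Employee",
            "Consultant", "Trainee", "SSS", "Agency Worker", ""],
})

print(df)
```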

For everyone whose Type is Contingent Worker, I have this list of acceptable values:

list = ['Agency Worker', 'Consultant']

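I need a way to check whether everyone with Type "Contingent Worker" has an acceptable value in "Job" and, if not (or if the value is blank), replace it with "Consultant", resulting in this dataframe: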

ID  Type                Job
1   Employee            Doctor
2   Contingent Worker   Consultant
3   Employee            Employee
4   Employee            Employee
5   Contingent Worker   Consultant
6   Contingent Worker   Consultant
7   Contingent Worker   Consultant
8   Contingent Worker   Consultant
9   Contingent Worker   Agency Worker
10  Contingent Worker   Consultant

What is the best way to achieve this result?

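You can use np.where for this.

  • Select only the slice of column Job with the relevant entries (so: df['Type']=='Contingent Worker' inside df.loc) and use Series.isin to check whether each string is found in your list (here: jobs). If it is, we keep the existing value; otherwise, we return "Consultant".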
import numpy as np

jobs = ['Agency Worker', 'Consultant']

# For 'Contingent Worker' rows: keep Job if it is in jobs, otherwise use 'Consultant'.
df.loc[df['Type']=='Contingent Worker','Job'] = np.where(
    df.loc[df['Type']=='Contingent Worker','Job'].isin(jobs),
    df.loc[df['Type']=='Contingent Worker','Job'],
    'Consultant')

print(df)

   ID               Type            Job
0   1           Employee         Doctor
1   2  Contingent Worker     Consultant
2   3           Employee       Employee
3   4           Employee       Employee
4   5  Contingent Worker     Consultant
5   6  Contingent Worker     Consultant
6   7  Contingent Worker     Consultant
7   8  Contingent Worker     Consultant
8   9  Contingent Worker  Agency Worker
9  10  Contingent Worker     Consultant
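
For a version that runs on its own, the np.where approach above could be sketched like this. The row filter is computed once and reused; the names is_contingent and current are illustrative, and the blank Job is assumed to be an empty string.

```python
import numpy as np
import pandas as pd

# Sample data from the question (blank Job assumed to be an empty string).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Type": ["Employee", "Contingent Worker", "Employee", "Employee"]
            + ["Contingent Worker"] * 6,
    "Job": ["Doctor", "Doctor", "Employee", "Employee", "Employee",
            "Consultant", "Trainee", "SSS", "Agency Worker", ""],
})
jobs = ["Agency Worker", "Consultant"]

# Build the row filter once instead of repeating the comparison three times.
is_contingent = df["Type"] == "Contingent Worker"
current = df.loc[is_contingent, "Job"]

# Keep acceptable jobs, replace everything else with 'Consultant'.
df.loc[is_contingent, "Job"] = np.where(current.isin(jobs), current, "Consultant")

print(df)
```

Reusing the filter does not change the result; it only keeps the three df.loc selections from drifting apart if the condition is edited later.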

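Note: please don't use "list" as a variable name. Since list is a built-in data type in Python, doing so shadows the built-in and you lose access to its functionality. For example: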

print(type(list))
<class 'type'>

list = ['Agency Worker', 'Consultant']
print(type(list))
<class 'list'>

lst = list()
# will now throw an error:
    TypeError: 'list' object is not callable
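
The same effect can be reproduced safely inside a function, so the shadowing does not leak into the rest of a script. This is a small sketch; shadowing_demo is a hypothetical helper name, and builtins.list is shown only as one way to still reach the original type.

```python
import builtins


def shadowing_demo():
    # Inside this function, the name 'list' now refers to a list object,
    # not to the built-in type.
    list = ['Agency Worker', 'Consultant']
    try:
        list()  # calling a list object fails
    except TypeError as exc:
        error = str(exc)
    # The built-in type is still reachable through the builtins module.
    rebuilt = builtins.list('ab')
    return error, rebuilt


error, rebuilt = shadowing_demo()
print(error)
print(rebuilt)
```

Choosing a different name, such as jobs in the answer above, avoids the problem entirely.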

I would do it the following way:

df.loc[(df.Type=='Contingent Worker') & ~df.Job.isin(['Agency Worker', 'Consultant']),'Job'] = 'Consultant'
print(df)

which gives the output

                 Type            Job
ID
1            Employee         Doctor
2   Contingent Worker     Consultant
3            Employee       Employee
4            Employee       Employee
5   Contingent Worker     Consultant
6   Contingent Worker     Consultant
7   Contingent Worker     Consultant
8   Contingent Worker     Consultant
9   Contingent Worker  Agency Worker
10  Contingent Worker     Consultant

Explanation: select the rows where Type is Contingent Worker and (&) Job is not (~) one of the values in the list (isin), select the Job column, and set its value to Consultant.
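
A self-contained sketch of this boolean-mask approach is shown below, together with an equivalent spelling using Series.mask. The assumptions here are that ID is the index (as in the output above) and that the blank Job is a missing value (None); the names needs_fix, via_loc and via_mask are illustrative.

```python
import pandas as pd

# Sample data from the question, with ID as the index.
# Assumption: the blank Job for ID 10 is a missing value.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Type": ["Employee", "Contingent Worker", "Employee", "Employee"]
            + ["Contingent Worker"] * 6,
    "Job": ["Doctor", "Doctor", "Employee", "Employee", "Employee",
            "Consultant", "Trainee", "SSS", "Agency Worker", None],
}).set_index("ID")
jobs = ["Agency Worker", "Consultant"]

# Rows that are Contingent Worker AND whose Job is not an accepted value.
# isin returns False for missing values, so blanks are included here too.
needs_fix = (df["Type"] == "Contingent Worker") & ~df["Job"].isin(jobs)

# Approach from the answer: assign through .loc on a copy.
via_loc = df.copy()
via_loc.loc[needs_fix, "Job"] = "Consultant"

# Alternative: Series.mask replaces values where the condition is True.
via_mask = df.assign(Job=df["Job"].mask(needs_fix, "Consultant"))

print(via_loc)
```

Both variants touch only the rows flagged by needs_fix, so Employee rows such as ID 1 keep their original Job.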

