
Multiple conditions for string variable

I am trying to add a new column "profile_type" to a dataframe "df_new". It should contain the string "Decision Maker" if "job_title" has any one of the following words: (Head or VP or COO or CEO or CMO or CLO or Chief or Partner or Founder or Owner or CIO or CTO or President or Leaders),

"Key Influencer" if "job_title" has any one of the following words: (Senior or Consultant or Manager or Learning or Training or Talent or HR or Human Resources or L&D or Lead), and

"Influencer" for all other values in "job_title".

For example, if 'job_title' has a row "Learning and Development Specialist", the code has to pick up just the word 'Learning' and classify that row as 'Key Influencer' under 'profile_type'.

I would try something like this:

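import re

import numpy as np

dm_titles = ['Head', 'VP', 'COO']  # ... plus the rest of the Decision Maker words
ki_titles = ['Senior', 'Consultant', 'Manager']  # ... plus the rest of the Key Influencer words

# `word in df_new['job_title']` would test the index labels, not the text,
# so match the column against the words with str.contains instead.
titles = df_new['job_title']
conditions = [
    titles.str.contains('|'.join(map(re.escape, dm_titles)), na=False),
    titles.str.contains('|'.join(map(re.escape, ki_titles)), na=False),
]

values = ["Decision Maker", "Key Influencer"]

# np.select uses the first condition that matches; all other rows get the default.
df_new['profile_type'] = np.select(conditions, values, default="Influencer")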

Let me know if you need any clarification!
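A small self-contained sketch of the `np.select` approach, using made-up sample titles and shortened word lists (both are assumptions for illustration, not data from the question). It also shows that `in` applied to a Series checks the index labels rather than the values, which is why the conditions go through `str.contains`.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical sample data; the real df_new comes from the question.
df_new = pd.DataFrame({
    "job_title": [
        "VP of Sales",
        "Learning and Development Specialist",
        "Software Engineer",
    ]
})

# Shortened, illustrative word lists.
dm_titles = ["Head", "VP", "COO"]
ki_titles = ["Senior", "Consultant", "Manager", "Learning"]

titles = df_new["job_title"]

# `in` on a Series looks at the index labels (0, 1, 2 here), not the strings.
in_checks_index = ("VP of Sales" in titles, 0 in titles)

conditions = [
    titles.str.contains("|".join(map(re.escape, dm_titles)), na=False),
    titles.str.contains("|".join(map(re.escape, ki_titles)), na=False),
]
values = ["Decision Maker", "Key Influencer"]

# The first matching condition wins; unmatched rows get the default.
df_new["profile_type"] = np.select(conditions, values, default="Influencer")
print(df_new)
```

Because the Decision Maker condition comes first in the list, a title that contains words from both lists ends up as "Decision Maker".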

The code below worked for me.

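import re

import numpy as np

# The '...' in each pattern stands for the rest of the word list; fill it in before running.
s1 = df_new['job_title']

# na=False makes missing titles count as "no match" instead of propagating NaN.
condition1 = s1.str.contains('Director|Head|VP|COO|CEO...', flags=re.IGNORECASE, regex=True, na=False)

condition2 = s1.str.contains('Senior|Consultant|Manager|Learning...', flags=re.IGNORECASE, regex=True, na=False)

# Decision Maker takes priority; Key Influencer is checked only for the remaining rows.
df_new['profile_type'] = np.where(condition1, 'Decision Maker',
                                  np.where(condition2, 'Key Influencer', 'Influencer'))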
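One caveat with plain substring patterns: with `re.IGNORECASE`, `COO` also matches the start of "Coordinator", and `CTO` matches inside "Director". A possible refinement, sketched below with made-up sample titles and shortened word lists (assumptions, not the poster's data), is to wrap the alternation in `\b` word boundaries. The helper name `word_pattern` is hypothetical.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
df_new = pd.DataFrame({
    "job_title": ["Training Coordinator", "COO", "Data Analyst"]
})

# Shortened, illustrative word lists.
dm_words = ["Head", "VP", "COO", "CEO", "CTO"]
ki_words = ["Senior", "Consultant", "Manager", "Learning", "Training"]


def word_pattern(words):
    """Build a regex that matches any of the words as a whole word."""
    return r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"


titles = df_new["job_title"]

# Plain substring match: "Coordinator" contains "coo", so that row is a false hit.
loose_dm = titles.str.contains("|".join(dm_words), flags=re.IGNORECASE, regex=True, na=False)

# Whole-word match: only the stand-alone word "COO" counts.
is_dm = titles.str.contains(word_pattern(dm_words), flags=re.IGNORECASE, regex=True, na=False)
is_ki = titles.str.contains(word_pattern(ki_words), flags=re.IGNORECASE, regex=True, na=False)

df_new["profile_type"] = np.where(
    is_dm, "Decision Maker", np.where(is_ki, "Key Influencer", "Influencer")
)
print(df_new)
```

With the boundaries in place, "Training Coordinator" is classified through the word "Training" rather than through the accidental "coo" match.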

First, define a function that acts on a row of the dataframe and returns what you want: in your case, 'Decision Maker' if the job_title contains any of the words in your list.

def is_key_worker(row):
    if any(word in row["job_title"] for word in ("CTO", "Founder")):  # add more here.
        return "Decision Maker"
    return "Influencer"

Next, apply the function to your dataframe, along axis 1.

df_new["profile_type"] = df_new.apply(is_key_worker, axis=1)
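A fuller version of the row-wise idea might look like the sketch below. The helper name `classify_title`, the shortened word lists, and the sample titles are all hypothetical. It applies the function to the `job_title` column directly instead of passing whole rows, and it compares whole words, case-sensitively.

```python
import pandas as pd

# Shortened, illustrative word lists.
DM_WORDS = ("Head", "VP", "CTO", "Founder")
KI_WORDS = ("Senior", "Consultant", "Manager", "Learning")


def classify_title(title):
    """Return the profile type for one job title (hypothetical helper)."""
    if not isinstance(title, str):
        return "Influencer"  # treat missing titles as the fallback category
    words = title.replace(",", " ").split()
    if any(w in DM_WORDS for w in words):
        return "Decision Maker"
    if any(w in KI_WORDS for w in words):
        return "Key Influencer"
    return "Influencer"


# Hypothetical sample data for illustration.
df_new = pd.DataFrame({
    "job_title": ["Founder, CTO", "Senior Manager", "Accountant", None]
})
df_new["profile_type"] = df_new["job_title"].apply(classify_title)
print(df_new)
```

A plain Python function like this is easy to read and extend, but on large frames the vectorized `str.contains` approach is usually faster.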
