
Pandas - How to create a column with 3 outputs based on conditions on multiple columns

I have a DataFrame df whose rows are generated by this function:

# Assumes `import random`, a Faker instance named `fake` (from the faker
# package), and that sex_list and accounts_list are defined elsewhere.
def fake_data():
    return {
        'Name': fake.name(),
        'Gender': random.choice(sex_list),
        'Address': fake.street_address(),
        'Nationality': 'Zimbabwean',
        'Account_Type': random.choice(accounts_list),
        'Age': random.randint(0, 2),
        'Education': random.random() > 0.5,
        'Employment': random.randint(0, 2),
        'Salary': random.randint(0, 2),
        'Employer_Stability': random.random() > 0.5,
        'Consistency': random.random() > 0.5,
        'Balance': random.randint(0, 2),
        'Residential_Status': random.random() > 0.5,
    }

I want to create a column Service_Level that is 0, 1, or 2 depending on the values in these columns:

columns = ['Age','Education', 'Employment', 'Salary', 'Employer_Stability', 'Consistency', 'Balance', 'Residential_Status']

After reading some answers here, I tried the following to produce ['Service_Level'] = 0:

df['Service_Level'] = np.where((df['Age']==0)&(df['Education']==False)&(df['Employment']==0)&(df['Salary']==0)&(df['Employer_Stability']==False)&(df['Consistency']==False)&(df['Balance']==0)&(df['Residential_Status']==False),
                               (df['Age'])|(df['Education'])|(df['Employment'])|(df['Salary'])|(df['Employer_Stability'])|(df['Consistency'])|(df['Balance'])|(df['Residential_Status']), 0)

Then this for ['Service_Level'] = 1:

df['Service_Level'] = np.where((df['Age']==1)&(df['Education']==True)&(df['Employment']==1)&(df['Salary']==1)&(df['Employer_Stability']==False)&(df['Consistency']==True)&(df['Balance']==1)&(df['Residential_Status']==True),
                               (df['Age'])|(df['Education'])|(df['Employment'])|(df['Salary'])|(df['Employer_Stability'])|(df['Consistency'])|(df['Balance'])|(df['Residential_Status']), 1)

Then this for ['Service_Level'] = 2:

df['Service_Level'] = np.where((df['Age']==2)&(df['Education']==True)&(df['Employment']==2)&(df['Salary']==2)&(df['Employer_Stability']==True)&(df['Consistency']==True)&(df['Balance']==2)&(df['Residential_Status']==True),
                               (df['Age'])|(df['Education'])|(df['Employment'])|(df['Salary'])|(df['Employer_Stability'])|(df['Consistency'])|(df['Balance'])|(df['Residential_Status']), 2)

Unfortunately, I can't figure out how to combine these conditions so that I get 0, 1, or 2 in a single column.

If it works, what happens to the rows that do not match those exact conditions? I would like then to also produce and output

You might need to use slicing (boolean masking) in conjunction with np.where, which, by the way, takes three arguments: the condition, val1 (used where the condition is true), and val2 (used where it is false).
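As a minimal illustration of that three-argument form, the sketch below uses a made-up array named ages (not taken from the question). np.where returns the second argument where the condition holds and the third argument everywhere else.

```python
import numpy as np

# Hypothetical sample values, used only for illustration.
ages = np.array([0, 1, 2, 0])

# Arguments: condition, value where True, value where False.
labels = np.where(ages == 0, "zero", "nonzero")
print(labels)
```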

Your first statement becomes:

df['Service_Level'] = np.where(condition_1, 0, 1)

This fills df['Service_Level'] with 0 for the rows that meet the first condition and 1 for all other rows.

Now mask the data to get only the rows where Service_Level is not 0:

df[df['Service_Level'] !=0] 

On this subset you can apply the second condition with

np.where(condition_2, 1, 2)

to assign 1 to df['Service_Level'] where the condition is true and 2 to the rest of the rows.
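The two steps above can be sketched roughly as follows. This is only an illustration under assumptions: the frame has two of the eight columns, the sample rows are invented, and condition_1 and condition_2 are shortened stand-ins for the full conditions in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame; the real df has eight condition columns.
df = pd.DataFrame({
    "Age": [0, 1, 2, 1],
    "Education": [False, True, True, False],
})

# Shortened stand-ins for the question's full conditions.
condition_1 = (df["Age"] == 0) & ~df["Education"]
condition_2 = (df["Age"] == 1) & df["Education"]

# Step 1: 0 where condition_1 holds, 1 (a placeholder) everywhere else.
df["Service_Level"] = np.where(condition_1, 0, 1)

# Step 2: among the rows that are not 0, assign 1 or 2 using condition_2.
mask = df["Service_Level"] != 0
df.loc[mask, "Service_Level"] = np.where(condition_2[mask], 1, 2)

print(df)
```

Note that with this approach every row that matches neither condition ends up as 2, including the last sample row here.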

EDIT:

You can also nest a second np.where, carrying the second condition, inside the first one like this:

df['Service_Level'] = np.where(cond_1, 0, np.where(cond_2, 1, 2))

For better readability, you may want to first save the conditions as cond_1, cond_2, and so on, and then use them in np.where.
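A sketch of that suggestion, using the exact patterns from the question, might look like the following. The four sample rows, the helper name matches, and the dictionaries level_0, level_1, and level_2 are assumptions introduced for illustration. The last part swaps in np.select, which is not part of the original answer, as one possible way to give rows that match none of the patterns their own value; the default of -1 is an arbitrary choice.

```python
import numpy as np
import pandas as pd

# Four hand-written rows (hypothetical data): one per service level plus one
# row that matches none of the exact patterns.
df = pd.DataFrame({
    "Age":                [0, 1, 2, 2],
    "Education":          [False, True, True, False],
    "Employment":         [0, 1, 2, 1],
    "Salary":             [0, 1, 2, 0],
    "Employer_Stability": [False, False, True, True],
    "Consistency":        [False, True, True, False],
    "Balance":            [0, 1, 2, 1],
    "Residential_Status": [False, True, True, True],
})

# The exact patterns from the question, one dict per level.
level_0 = {"Age": 0, "Education": False, "Employment": 0, "Salary": 0,
           "Employer_Stability": False, "Consistency": False,
           "Balance": 0, "Residential_Status": False}
level_1 = {"Age": 1, "Education": True, "Employment": 1, "Salary": 1,
           "Employer_Stability": False, "Consistency": True,
           "Balance": 1, "Residential_Status": True}
level_2 = {"Age": 2, "Education": True, "Employment": 2, "Salary": 2,
           "Employer_Stability": True, "Consistency": True,
           "Balance": 2, "Residential_Status": True}


def matches(frame, pattern):
    """Hypothetical helper: True where every column equals the pattern's value."""
    return np.logical_and.reduce(
        [frame[col] == val for col, val in pattern.items()]
    )


cond_1 = matches(df, level_0)
cond_2 = matches(df, level_1)
cond_3 = matches(df, level_2)

# Nested np.where, as in the answer: anything that is not 0 or 1 becomes 2.
nested = np.where(cond_1, 0, np.where(cond_2, 1, 2))

# Alternative (an assumption, not from the answer): np.select with a default
# of -1 marks rows that match none of the three patterns.
df["Service_Level"] = np.select([cond_1, cond_2, cond_3], [0, 1, 2], default=-1)

print(nested)
print(df["Service_Level"].tolist())
```

The nested form is compact but silently labels every unmatched row as 2, whereas the np.select variant keeps those rows distinguishable.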
