
Apply an operation on the value of column B depending on the value of column A - Pandas

I have the following dataset:

id  Wages   PayFreq
0   1013    Weekly
1   5000    Monthly
2   892     Weekly
3   2320    Bi-Weekly
4   1068    Weekly
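
To make the snippets on this page easy to run, the table above can be rebuilt as a DataFrame. This is only a minimal sketch: the variable name df is an assumption, chosen to match the name used in the answers.

```python
import pandas as pd

# Rebuild the sample data shown above (df is an assumed variable name).
df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "Wages": [1013, 5000, 892, 2320, 1068],
    "PayFreq": ["Weekly", "Monthly", "Weekly", "Bi-Weekly", "Weekly"],
})
print(df)
```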

I intend to perform the following operation:

if PayFreq == 'Monthly':
    (Wages / 4) * 52
elif PayFreq == 'Bi-Weekly':
    (Wages / 2) * 52
else:
    Wages * 52

I need to choose which operation to apply to the Wages column based on the value in the PayFreq column. Any ideas?

Use Series.map with a dictionary of multipliers. map returns NaN for values that are not keys in the dictionary (here 'Weekly'), so replace those with 52 using Series.fillna, then multiply by the Wages column:

d = {'Monthly': 52/4, 'Bi-Weekly': 52/2}
df['YearWages'] = df['PayFreq'].map(d).fillna(52).mul(df['Wages'])

print (df)
   id  Wages    PayFreq  YearWages
0   0   1013     Weekly    52676.0
1   1   5000    Monthly    65000.0
2   2    892     Weekly    46384.0
3   3   2320  Bi-Weekly    60320.0
4   4   1068     Weekly    55536.0
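
To see why the fillna step is needed, here is a small sketch that splits the chain into its intermediate steps. The names raw and factor are assumptions introduced for illustration; the data is the sample from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "Wages": [1013, 5000, 892, 2320, 1068],
    "PayFreq": ["Weekly", "Monthly", "Weekly", "Bi-Weekly", "Weekly"],
})
d = {"Monthly": 52 / 4, "Bi-Weekly": 52 / 2}

# Step 1: look up each PayFreq in d; 'Weekly' is not a key, so it becomes NaN.
raw = df["PayFreq"].map(d)
print(raw.tolist())     # [nan, 13.0, nan, 26.0, nan]

# Step 2: treat every unmatched frequency as weekly (52 pay periods per year).
factor = raw.fillna(52)
print(factor.tolist())  # [52.0, 13.0, 52.0, 26.0, 52.0]

# Step 3: multiply element-wise by Wages.
df["YearWages"] = factor.mul(df["Wages"])
```

Note that fillna(52) also applies to any unexpected or misspelled PayFreq value, which matches the else branch in the question.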

Solution with masks passed to numpy.select:

import numpy as np

# Conditions are checked in order; rows matching none of them use the default.
df['NewWages'] = np.select([df['PayFreq'] == 'Monthly',
                            df['PayFreq'] == 'Bi-Weekly'],
                           [(df['Wages'] / 4) * 52,
                            (df['Wages'] / 2) * 52],
                           default=df['Wages'] * 52)

print(df)
   id  Wages    PayFreq  NewWages
0   0   1013     Weekly   52676.0
1   1   5000    Monthly   65000.0
2   2    892     Weekly   46384.0
3   3   2320  Bi-Weekly   60320.0
4   4   1068     Weekly   55536.0

Or you can nest np.where calls, one per condition, with the weekly case as the innermost fallback:

import numpy as np
df['NewWages'] = np.where(df['PayFreq'] == 'Monthly', (df['Wages'] / 4) * 52,
                          np.where(df['PayFreq'] == 'Bi-Weekly', (df['Wages'] / 2) * 52,
                                   df['Wages'] * 52))

which prints:

   id  Wages    PayFreq  NewWages
0   0   1013     Weekly   52676.0
1   1   5000    Monthly   65000.0
2   2    892     Weekly   46384.0
3   3   2320  Bi-Weekly   60320.0
4   4   1068     Weekly   55536.0

I would suggest using DataFrame.apply, as it is simple and intuitive.

You can define the function to apply to the DataFrame as either a lambda expression or an explicit function. For example, here is a simple implementation with a function:

def func(row):
    if row['PayFreq'] == 'Monthly':
        return (row['Wages'] / 4) * 52
    elif row['PayFreq'] == 'Bi-Weekly':
        return (row['Wages'] / 2) * 52
    else:
        return row['Wages'] * 52

To apply it to your DataFrame, pass axis=1 so that the function is called once per row:

df['NewWages'] = df.apply(func, axis=1)

The result:

   Wages    PayFreq  NewWages
0   1013     Weekly   52676.0
1   5000    Monthly   65000.0
2    892     Weekly   46384.0
3   2320  Bi-Weekly   60320.0
4   1068     Weekly   55536.0
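
The lambda-expression variant mentioned above could look like the following sketch. The divisor dictionary is a hypothetical helper introduced here, not part of the original answer; any PayFreq not listed in it is treated as weekly.

```python
import pandas as pd

df = pd.DataFrame({
    "Wages": [1013, 5000, 892, 2320, 1068],
    "PayFreq": ["Weekly", "Monthly", "Weekly", "Bi-Weekly", "Weekly"],
})

# Hypothetical lookup: weeks covered by one pay period (defaults to 1 for weekly).
divisor = {"Monthly": 4, "Bi-Weekly": 2}

# axis=1 passes each row to the lambda as a Series.
df["NewWages"] = df.apply(
    lambda row: row["Wages"] / divisor.get(row["PayFreq"], 1) * 52,
    axis=1,
)
print(df)
```

Keep in mind that apply with axis=1 calls the Python function once per row, so on large frames it is usually slower than the vectorized map or numpy approaches.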
