
Use lambda with pandas to calculate a new column conditional on an existing column

I need to create a new column in a pandas DataFrame, calculated as the ratio of two existing columns. However, the denominator of the ratio changes depending on the value of a string in another column of the DataFrame.

Sample dataset:

import pandas as pd
df = pd.DataFrame(data={'hand'      : ['left','left','both','both'], 
                        'exp_force' : [25,28,82,84], 
                        'left_max'  : [38,38,38,38], 
                        'both_max'  : [90,90,90,90]})

I need to create a new DataFrame column df['ratio'] based on the value of df['hand'].

If df['hand']=='left' then df['ratio'] = df['exp_force'] / df['left_max']

If df['hand']=='both' then df['ratio'] = df['exp_force'] / df['both_max']

You can use np.where():

import numpy as np
import pandas as pd
df = pd.DataFrame(data={'hand'      : ['left','left','both','both'], 
                        'exp_force' : [25,28,82,84], 
                        'left_max'  : [38,38,38,38], 
                        'both_max'  : [90,90,90,90]})
df['ratio'] = np.where((df['hand']=='left'), df['exp_force'] / df['left_max'], df['exp_force'] / df['both_max'])
df

Out[42]: 
   hand  exp_force  left_max  both_max     ratio
0  left         25        38        90  0.657895
1  left         28        38        90  0.736842
2  both         82        38        90  0.911111
3  both         84        38        90  0.933333

Alternatively, in a real-life scenario with many conditions and results, you can use np.select() so that you don't have to keep repeating np.where() calls, as I did a lot in my older code. np.select is the better choice in these situations:

import numpy as np
import pandas as pd
df = pd.DataFrame(data={'hand'      : ['left','left','both','both'], 
                        'exp_force' : [25,28,82,84], 
                        'left_max'  : [38,38,38,38], 
                        'both_max'  : [90,90,90,90]})
c1 = (df['hand']=='left')
c2 = (df['hand']=='both')
r1 = df['exp_force'] / df['left_max']
r2 = df['exp_force'] / df['both_max']
conditions = [c1,c2]
results = [r1,r2]
df['ratio'] = np.select(conditions,results)
df

Out[430]: 
   hand  exp_force  left_max  both_max     ratio
0  left         25        38        90  0.657895
1  left         28        38        90  0.736842
2  both         82        38        90  0.911111
3  both         84        38        90  0.933333
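One thing to watch with np.select(): rows that match none of the conditions get the `default` value, which is 0 unless you pass something else. Below is a minimal sketch of passing `default=np.nan` so unmatched rows show up as missing. The 'right' row is a hypothetical addition for illustration and is not part of the sample data above.

```python
import numpy as np
import pandas as pd

# Same columns as the example, plus a hypothetical 'right' row
# that matches neither condition.
df = pd.DataFrame({'hand': ['left', 'both', 'right'],
                   'exp_force': [25, 82, 30],
                   'left_max': [38, 38, 38],
                   'both_max': [90, 90, 90]})

conditions = [df['hand'] == 'left', df['hand'] == 'both']
results = [df['exp_force'] / df['left_max'],
           df['exp_force'] / df['both_max']]

# default=np.nan marks unmatched rows as missing instead of filling them with 0.
df['ratio'] = np.select(conditions, results, default=np.nan)
print(df)
```

With the default left at 0, the 'right' row would silently get a ratio of 0, which is easy to mistake for a real measurement.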

Using enumerate() (this assumes the default integer index, because df.at looks rows up by label):

# i is the row position, e is the value of 'hand' in that row
for i, e in enumerate(df['hand']):
    if e == 'left':
        df.at[i, 'ratio'] = df.at[i, 'exp_force'] / df.at[i, 'left_max']
    elif e == 'both':
        df.at[i, 'ratio'] = df.at[i, 'exp_force'] / df.at[i, 'both_max']
df

Output:

    hand    exp_force   left_max    both_max    ratio
0   left    25            38          90      0.657895
1   left    28            38          90      0.736842
2   both    82            38          90      0.911111
3   both    84            38          90      0.933333
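If the DataFrame's index is not 0, 1, 2, ..., the positions from enumerate() no longer match the row labels. A variant of the loop, sketched here, iterates over the index labels with Series.items() instead. The string index 'a', 'b' is a hypothetical example, and the column is pre-filled with NaN so that rows matching neither branch stay missing.

```python
import pandas as pd

df = pd.DataFrame({'hand': ['left', 'both'],
                   'exp_force': [25, 82],
                   'left_max': [38, 38],
                   'both_max': [90, 90]},
                  index=['a', 'b'])  # hypothetical non-default index

# Create the column up front; unmatched rows remain NaN.
df['ratio'] = float('nan')

# Series.items() yields (index label, value) pairs, which is what df.at expects.
for label, hand in df['hand'].items():
    if hand == 'left':
        df.at[label, 'ratio'] = df.at[label, 'exp_force'] / df.at[label, 'left_max']
    elif hand == 'both':
        df.at[label, 'ratio'] = df.at[label, 'exp_force'] / df.at[label, 'both_max']
print(df)
```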

You can use the apply() method of your DataFrame:

df['ratio'] = df.apply(
    lambda x: x['exp_force'] / x['left_max'] if x['hand']=='left' else x['exp_force'] / x['both_max'],
    axis=1
)
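Note that apply() with axis=1 calls the lambda once per row, so on large DataFrames it is usually slower than the vectorized np.where() version, though both should give the same numbers. A small sketch to check that on the sample data (the variable names ratio_apply and ratio_where are just illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'hand': ['left', 'left', 'both', 'both'],
                   'exp_force': [25, 28, 82, 84],
                   'left_max': [38, 38, 38, 38],
                   'both_max': [90, 90, 90, 90]})

# Row-wise: the lambda runs once for every row.
ratio_apply = df.apply(
    lambda x: x['exp_force'] / x['left_max'] if x['hand'] == 'left'
    else x['exp_force'] / x['both_max'],
    axis=1
)

# Vectorized: both candidate ratios are computed for whole columns,
# then np.where picks one per row.
ratio_where = np.where(df['hand'] == 'left',
                       df['exp_force'] / df['left_max'],
                       df['exp_force'] / df['both_max'])

# Confirm the two approaches agree before choosing one.
assert np.allclose(ratio_apply, ratio_where)
```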
