Percentage difference between any two columns of pandas dataframe

I would like to define a function that calculates the percentage difference between any two pandas columns. Let's say that my dataframe is defined by:

R1  R2    R3    R4   R5    R6
 A   B     1     2    3     4

I would like my calculation defined as

df['R7'] = df[['R3','R4']].apply( method call to calculate perc diff)

and

df['R8'] = df[['R5','R6']].apply(same method call to calculate perc diff)

How can I do this?

I have tried the following:

df['perc_cnco_error'] = df[['CumNetChargeOffs_x','CumNetChargeOffs_y']].apply(lambda x,y: percCalc(x,y))

def percCalc(x,y):
    if x<1e-9:
        return 0
    else:
        return (y - x)*100/x

and it gives me the error message

TypeError: ('<lambda>() takes exactly 2 arguments (1 given)', u'occurred at index CumNetChargeOffs_x')

In its simplest terms, is this what you're looking for?

def percentage_change(col1,col2):
    return ((col2 - col1) / col1) * 100

You can apply it to any two columns of your dataframe:

df['a'] = percentage_change(df['R3'], df['R4'])
df['b'] = percentage_change(df['R6'], df['R5'])

Out[220]: 
  R1 R2  R3  R4  R5  R6      a     b
0  A  B   1   2   3   4  100.0 -25.0
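
Note that percentage_change divides by col1 directly, so a zero in col1 produces inf or NaN. Here is a minimal sketch of a vectorized variant, assuming you want the same guard as percCalc in the question (return 0 when the base is below 1e-9). The name percentage_change_safe, the eps parameter, and the second data row are illustrative and not part of the original answer:

```python
import pandas as pd

def percentage_change_safe(col1, col2, eps=1e-9):
    """Percentage change from col1 to col2; 0 where col1 < eps (hypothetical helper)."""
    col1 = col1.astype(float)
    col2 = col2.astype(float)
    # Mask bases below eps with NaN so the division never hits zero,
    # then fill those positions with 0, as percCalc does.
    # Caveat: NaN values already present in the inputs also become 0.
    base = col1.where(col1 >= eps)
    return ((col2 - col1) / base * 100).fillna(0.0)

# First row is the question's data; the second row (R3 == 0) is made up
# to exercise the zero guard.
df = pd.DataFrame({
    "R1": ["A", "A"], "R2": ["B", "B"],
    "R3": [1, 0], "R4": [2, 5],
    "R5": [3, 3], "R6": [4, 4],
})
df["R7"] = percentage_change_safe(df["R3"], df["R4"])
df["R8"] = percentage_change_safe(df["R5"], df["R6"])
```

Because the whole calculation works on Series, it avoids a Python-level loop over rows, which is usually noticeably faster than apply on large frames.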

This would give you the deviation in percentage (first column minus second, relative to the first):

df.apply(lambda row: (row.iloc[0]-row.iloc[1])/row.iloc[0]*100, axis=1)

If you have more than two columns, select the two you want first:

df[['R3', 'R5']].apply(lambda row: (row.iloc[0]-row.iloc[1])/row.iloc[0]*100, axis=1)
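
The same row-wise pattern also explains the TypeError in the question: without axis=1, apply passes each column to the function as a single argument, so a two-argument lambda fails. A minimal sketch of the question's attempt rewritten this way, keeping percCalc unchanged and assuming the R3..R6 example data from the question:

```python
import pandas as pd

def percCalc(x, y):
    # Same logic as in the question: guard against a (near-)zero base.
    if x < 1e-9:
        return 0
    return (y - x) * 100 / x

df = pd.DataFrame({"R1": ["A"], "R2": ["B"],
                   "R3": [1], "R4": [2], "R5": [3], "R6": [4]})

# axis=1 hands each row to the lambda as one Series,
# so the lambda takes a single argument and unpacks it by column name.
df["R7"] = df[["R3", "R4"]].apply(lambda row: percCalc(row["R3"], row["R4"]), axis=1)
df["R8"] = df[["R5", "R6"]].apply(lambda row: percCalc(row["R5"], row["R6"]), axis=1)
```

Note that percCalc computes (y - x) / x, so its sign is the opposite of the (first - second) / first expression above.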

To calculate the percentage difference between R3 and R4, you can use:

df['R7'] = (df.R3 - df.R4) / df.R3 * 100
