How can I create a function to make this step easier in Pandas?
I have a DataFrame that looks like this:
ID | Clicks | Clicks_GA | Discrep_% | Discrep_Found
---|---|---|---|---
5939 | 18482 | 18480 | .01 | False
# Calculates the discrepancy % (numpy is imported as np)
df['Discrep_%'] = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
# True if the discrepancy is greater than .05, False otherwise
df['Discrep_Found'] = (df['Discrep_%'] > .05)
The problem is that I have multiple DataFrames, and I don't want to copy and paste the same lines of code a bunch of times.
Is there a function I can use to make this process simpler?
Thanks!
Try this:
def count_some(df):
    val = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
    return val, val > .05
df[["Discrep_%", "Discrep_Found"]] = df.apply(count_some, axis=1, result_type='expand')
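Since the arithmetic in `count_some` is already element-wise, the same calculation can also be run on whole columns without a row-wise `apply`. The following is a minimal sketch, assuming the column names from the question; the function name `add_discrepancy` and the `threshold` parameter are hypothetical. The formula is copied unchanged from the question. As written it divides by `Clicks_GA * 100`; if a percentage is intended, `... / df['Clicks_GA'] * 100` may be what is meant, which would give roughly the `.01` shown in the sample row.

```python
import numpy as np
import pandas as pd


def add_discrepancy(df, threshold=0.05):
    """Return a copy of df with 'Discrep_%' and 'Discrep_Found' added.

    Hypothetical helper; the formula is the one from the question, unchanged.
    """
    discrep = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
    # Column names containing '%' cannot be written as keywords, so unpack a dict
    return df.assign(**{'Discrep_%': discrep, 'Discrep_Found': discrep > threshold})


# Sample row taken from the question's table
df = pd.DataFrame({'ID': [5939], 'Clicks': [18482], 'Clicks_GA': [18480]})
df = add_discrepancy(df)
```

Because the function returns a new DataFrame rather than modifying its argument, it can be called once per DataFrame (`df1 = add_discrepancy(df1)`) or chained with `df1.pipe(add_discrepancy)`.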
You could loop through the DataFrames. For example:
for df in [df1, df2, df3, ...]:
    df['Discrep_%'] = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
    df['Discrep_Found'] = (df['Discrep_%'] > .05)
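The loop works because each `df` taken from the list is the original DataFrame object, so assigning a column modifies that DataFrame in place. A small runnable sketch with made-up data (the contents of `df1` and `df2` are illustrative, apart from the first row of `df1`, which comes from the question):

```python
import numpy as np
import pandas as pd

# Illustrative frames; only the row in df1 comes from the question
df1 = pd.DataFrame({'Clicks': [18482], 'Clicks_GA': [18480]})
df2 = pd.DataFrame({'Clicks': [200, 100], 'Clicks_GA': [10, 100]})

for df in [df1, df2]:
    df['Discrep_%'] = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
    df['Discrep_Found'] = (df['Discrep_%'] > .05)

# The new columns were added to df1 and df2 themselves, not to copies
print(df2)
```

This approach assumes every DataFrame in the list has `Clicks` and `Clicks_GA` columns; a frame missing either one would raise a `KeyError` inside the loop.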