How to calculate percentage difference between two data frames with Pandas and return column and row names?
I'm using pandas.
For example, I have 2 data frames with the same column and row names:
df1
   name  a  b
0  row1  3  2
1  row2  7  2
2  row3  1  6

df2
   name  a  b
0  row1  4  2
1  row2  7  2
2  row3  2  3
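I need to calculate the percentage difference between the 2 data frames and, wherever the difference exceeds 30% (whether it is an increase or a decrease), return the answer in this format:
"row 2": "a" increased by 33.3%
"row 3": "a" increased by 50%, "b" decreased by 50%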
Set the name column as the index of both data frames, stack them, then concatenate them:
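import pandas as pd

# Index by name, then stack so each (row, column) pair becomes one entry
df1 = df1.set_index('name').stack()
df2 = df2.set_index('name').stack()

# Percentage change from l (df1) to r (df2); keep changes of at least 30%
out = pd.concat([df1, df2], keys=['l', 'r'], axis=1) \
        .pct_change(axis=1).mul(100) \
        .query('r.abs() >= 30')['r']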
Output:
>>> out
name
row1  a     33.333333
row3  a    100.000000
      b    -50.000000
Name: r, dtype: float64
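For anyone who wants to reproduce the result from scratch, here is a minimal self-contained sketch. It rebuilds the two sample frames from the question and computes the same percentage change with plain arithmetic, (new - old) / old * 100, instead of pct_change; for example, row1/a goes from 3 to 4, so (4 - 3) / 3 * 100 is about 33.3%. The names left, right, pct and changes are made up for illustration and are not part of the answer above.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({"name": ["row1", "row2", "row3"], "a": [3, 7, 1], "b": [2, 2, 6]})
df2 = pd.DataFrame({"name": ["row1", "row2", "row3"], "a": [4, 7, 2], "b": [2, 2, 3]})

# Align both frames on the row label so the arithmetic matches cell to cell.
left = df1.set_index("name")
right = df2.set_index("name")

# Percentage change of each cell in df2 relative to df1.
pct = (right - left) / left * 100

# Long format: one entry per (row, column); keep changes of at least 30%.
changes = pct.stack()
changes = changes[changes.abs() >= 30]
print(changes)
```

This should select the same three entries as the pct_change version (row1/a, row3/a and row3/b). Note that a 0 in df1 would make the division produce inf or NaN, so real data may need extra handling for that case.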
Formatting:
import numpy as np

lst = list(out.index.get_level_values(0) + ': '
           + out.index.get_level_values(1)
           + np.where(out > 0, ' increase by ', ' decrease by ')
           # abs(): the wording already says whether the value went up or down
           + out.abs().map(lambda x: f"{x:.2f}%"))
print(lst)
# Output
['row1: a increase by 33.33%',
 'row3: a increase by 100.00%',
 'row3: b decrease by 50.00%']
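The question asked for one line per row, with all of that row's changed columns listed together. A rough sketch of one way to get there is below. It assumes a filtered Series shaped like out, rebuilt here by hand so the snippet runs on its own; the helper describe and the names report and lines are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hand-built stand-in for the filtered Series `out` (values from the sample data).
out = pd.Series(
    [100 / 3, 100.0, -50.0],
    index=pd.MultiIndex.from_tuples(
        [("row1", "a"), ("row3", "a"), ("row3", "b")], names=["name", None]
    ),
    name="r",
)


def describe(col, pct):
    """Format one change, e.g. '"a" increased by 33.3%'."""
    verb = "increased" if pct > 0 else "decreased"
    return f'"{col}" {verb} by {abs(pct):.1f}%'


# Collect the formatted changes of each row under its row label.
report = {}
for (row, col), pct in out.items():
    report.setdefault(row, []).append(describe(col, pct))

# One line per row, columns separated by commas.
lines = [f'"{row}": ' + ", ".join(parts) for row, parts in report.items()]
print("\n".join(lines))
```

A plain dict keeps the rows in the order they appear in the Series, which is usually what you want for a report like this; a groupby on the first index level would be an alternative.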