pandas take pairwise difference between a lot of columns
I have a pandas dataframe with a lot of actual columns (names ending in _act) and projected columns (names ending in _proj). Besides the actual and projected columns there is also a date column. Now I want to add an error column for each pair, in order, i.e. right beside its projected column. Sample dataframe:
date  a_act  a_proj  b_act  b_proj  ....  z_act  z_proj
2020     10       5      9      11  ....      3      -1
.
.
What I want:
date  a_act  a_proj  a_error  b_act  b_proj  b_error  ....  z_act  z_proj  z_error
2020     10       5        5      9      11       -2  ....      3      -1        4
.
.
What's the best way to achieve this, given that I have a lot of actual and projected columns?
You could do:
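import pandas as pd

df = df.set_index('date')
# names for the new columns: a_act -> a_error, b_act -> b_error, ...
columns = df.columns[df.columns.str.endswith('_act')].str.replace(r'_act$', '_error', regex=True)
# compute act - proj by position (assumes the columns strictly alternate act, proj)
diffs = pd.DataFrame(data=df.values[:, ::2] - df.values[:, 1::2], index=df.index, columns=columns)
# concat
res = pd.concat((df, diffs), axis=1)
# reorder columns alphabetically (within each group: act, error, proj)
res = res.reindex(sorted(res.columns), axis=1)
print(res)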
Output
      a_act  a_error  a_proj  b_act  b_error  b_proj  z_act  z_error  z_proj
date
2020     10        5       5      9       -2      11      3        4      -1
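Note that sorting alphabetically puts each _error column between its _act and _proj columns, whereas the question asks for it right after _proj. If that exact order matters, or if the columns are not guaranteed to alternate act, proj, one possible variant is to pair columns by name instead of by position. This is only a sketch: the helper name add_error_columns is made up for illustration, and the sample frame contains just the a, b and z columns shown in the question.

```python
import pandas as pd

# Sample data mirroring the question (only a, b and z are shown there).
df = pd.DataFrame({
    "date": [2020],
    "a_act": [10], "a_proj": [5],
    "b_act": [9], "b_proj": [11],
    "z_act": [3], "z_proj": [-1],
})


def add_error_columns(frame):
    """Hypothetical helper: add <name>_error = <name>_act - <name>_proj
    and place it directly after <name>_proj."""
    errors = {}    # new error columns, keyed by column name
    ordered = []   # final column order
    for col in frame.columns:
        ordered.append(col)
        if col.endswith("_proj"):
            prefix = col[: -len("_proj")]
            act = prefix + "_act"
            # only build an error column when the matching _act column exists
            if act in frame.columns:
                errors[prefix + "_error"] = frame[act] - frame[col]
                ordered.append(prefix + "_error")
    # a single concat avoids inserting columns one by one into a wide frame
    out = pd.concat([frame, pd.DataFrame(errors, index=frame.index)], axis=1)
    return out[ordered]


res = add_error_columns(df)
print(res)
```

Because the pairs are matched by name, the date column can stay a regular column and any column without a counterpart is simply left alone.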