
pandas: take pairwise difference between a lot of columns

I have a pandas DataFrame with a lot of actual columns (names ending in _act) and projected columns (names ending in _proj). Besides the actual and projected columns there is also a date column. I want to add an error column for each actual/projected pair, placed right beside its projected column. Sample DataFrame:

date a_act a_proj b_act b_proj .... z_act z_proj
2020  10     5      9     11   ....   3     -1
.
.

What I want:

date a_act a_proj a_error b_act b_proj b_error .... z_act z_proj z_error
2020  10     5       5      9     11     -2    ....   3     -1     4
.
.

What's the best way to achieve this, given that I have a lot of actual and projected columns?

You could do:

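import pandas as pd

df = df.set_index('date')

# names for the new columns: a_act -> a_error, b_act -> b_error, ...
columns = df.columns[df.columns.str.endswith('_act')].str.replace('_act', '_error')

# compute differences (actual - projected); this assumes the columns
# strictly alternate x_act, x_proj, so even positions hold the actuals
diffs = pd.DataFrame(data=df.values[:, ::2] - df.values[:, 1::2], index=df.index, columns=columns)

# concat
res = pd.concat((df, diffs), axis=1)

# reorder columns alphabetically, which puts x_error between x_act and x_proj
res = res.reindex(sorted(res.columns), axis=1)
print(res)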

Output:

      a_act  a_error  a_proj  b_act  b_error  b_proj  z_act  z_error  z_proj
date                                                                        
2020     10        5       5      9       -2      11      3        4      -1
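
If the error column has to sit right after its projected column, as in the question, or if the act/proj columns are not guaranteed to alternate, pairing the columns by name may be safer than slicing by position. The following is only a sketch under those assumptions; the helper name add_error_columns and its suffix parameters are made up for illustration and are not part of pandas or of the answer above.

```python
import pandas as pd


def add_error_columns(df, act_suffix="_act", proj_suffix="_proj", err_suffix="_error"):
    """Return a new frame with <stem>_error = <stem>_act - <stem>_proj
    placed immediately after each <stem>_proj column (hypothetical helper)."""
    errors = {}  # new error columns, keyed by name
    order = []   # final column order
    for col in df.columns:
        order.append(col)
        if not col.endswith(proj_suffix):
            continue
        stem = col[: -len(proj_suffix)]
        act_col = stem + act_suffix
        if act_col not in df.columns:
            continue  # projected column without a matching actual: skip it
        err_col = stem + err_suffix
        errors[err_col] = df[act_col] - df[col]
        order.append(err_col)
    # build all new columns in one concat, then restore the desired order
    out = pd.concat([df, pd.DataFrame(errors, index=df.index)], axis=1)
    return out[order]


# sample data taken from the question (only the a, b and z pairs)
df = pd.DataFrame({
    "date": [2020],
    "a_act": [10], "a_proj": [5],
    "b_act": [9], "b_proj": [11],
    "z_act": [3], "z_proj": [-1],
})
res = add_error_columns(df)
print(res)
```

On the sample row this gives the column order date, a_act, a_proj, a_error, b_act, b_proj, b_error, z_act, z_proj, z_error with errors 5, -2 and 4. Collecting the new columns in a dict and concatenating once avoids inserting into the frame column by column, which can matter when there are many pairs.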


 