I have a DataFrame that holds both old and new data. Is there a way to calculate the difference between the old and new values? For generality, I don't want to sort the DataFrame, but only compare root variables whose column names end with the suffix "_old" or "_new".
df
apple_old daily banana_new banana_tree banana_old apple_new
0 5 3 4 2 10 6
for x in df.columns:
    if x.endswith("_old") and x.endswith("_new"):
        x = x.dif()
Expected output (brackets are shown just for clarity):
df_diff
apple_diff(old-new) banana_diff(old-new)
0 -1 (5-6) 6 (10-4)
Let's try creating a MultiIndex, then subtracting new from old.
Setup:
import pandas as pd
df = pd.DataFrame({'apple_old': {0: 5}, 'daily': {0: 3}, 'banana_new': {0: 4},
'banana_tree': {0: 2}, 'banana_old': {0: 10},
'apple_new': {0: 6}})
# Creation of Multi-Index:
df.columns = df.columns.str.rsplit('_', n=1, expand=True).swaplevel(0, 1)
# Subtract new from old (old - new):
output_df = (df['old'] - df['new']).add_suffix('_diff')
# Display:
print(output_df)
apple_diff banana_diff
0 -1 6
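Note that assigning to df.columns replaces the original column names. If they need to stay intact, the same steps can be applied to a copy. The following is only a minimal sketch; the helper name old_new_diff is made up for illustration and is not part of pandas:

```python
import pandas as pd

def old_new_diff(df):
    """Return old - new per root name without touching df itself.

    Hypothetical helper: a root that has only an _old or only a _new
    column ends up as an all-NaN column, because subtraction aligns labels.
    """
    tmp = df.copy()  # work on a copy so the caller's column names survive
    # Split each name at its last underscore and put the suffix on top.
    tmp.columns = tmp.columns.str.rsplit('_', n=1, expand=True).swaplevel(0, 1)
    return (tmp['old'] - tmp['new']).add_suffix('_diff')

df = pd.DataFrame({'apple_old': {0: 5}, 'daily': {0: 3}, 'banana_new': {0: 4},
                   'banana_tree': {0: 2}, 'banana_old': {0: 10},
                   'apple_new': {0: 6}})
result = old_new_diff(df)
print(result)
```

After the call, df still has its flat column names such as apple_old.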
MultiIndex with str.rsplit and a maximum split count of n=1, so that multiple _ are handled safely:
df.columns = df.columns.str.rsplit('_', n=1, expand=True).swaplevel(0, 1)
old NaN new tree old new
apple daily banana banana banana apple
0 5 3 4 2 10 6
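To see why n=1 matters, here is a small sketch with made-up column names (green_apple_old and green_apple_new do not appear in the question). Only the last underscore is used as the split point, and a name without any underscore gets NaN as its second part:

```python
import pandas as pd

# Hypothetical column names, one root containing an extra underscore.
cols = pd.Index(['green_apple_old', 'green_apple_new', 'daily'])

# n=1 splits once, starting from the right; expand=True returns a MultiIndex.
split_cols = cols.str.rsplit('_', n=1, expand=True)
pairs = split_cols.tolist()
print(pairs)
```

The root green_apple stays in one piece, so it can still be matched between the old and new groups.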
Then selection:
df['old']
apple banana
0 5 10
df['new']
banana apple
0 4 6
Subtraction will align by columns. Then use add_suffix to add _diff to the column names.
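The alignment step can be checked in isolation. This sketch rebuilds the two selections by hand (the frames old and new are stand-ins for df['old'] and df['new']), with the columns deliberately in a different order:

```python
import pandas as pd

# Stand-ins for df['old'] and df['new']; note the differing column order.
old = pd.DataFrame({'apple': [5], 'banana': [10]})
new = pd.DataFrame({'banana': [4], 'apple': [6]})

# Arithmetic matches columns by label, not by position.
diff = (old - new).add_suffix('_diff')
print(diff)
```

Each value is paired with the column of the same name, so no sorting is needed beforehand.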