Calculate difference of two columns from two different dataframes based on condition
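I have two dataframes with a common column. I want to create a new column containing the difference between two columns (one from each dataframe), based on a condition on a third column.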
df_a:
Time Volume ID
1 5 1
2 6 2
3 7 3
df_b:
Time Volume ID
1 2 2
2 3 1
3 4 3
The desired output adds a new column to df_a holding the difference between the Volume columns (df_a.Volume - df_b.Volume) for the rows where the two IDs are equal.
df_a:
Time Volume ID Diff
1 5 1 2
2 6 2 4
3 7 3 3
If each ID appears only once in each dataframe:
df_a['Diff'] = df_a['Volume'] - df_a['ID'].map(df_b.set_index('ID')['Volume'])
Output:
Time Volume ID Diff
0 1 5 1 2
1 2 6 2 4
2 3 7 3 3
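The `map` lookup is only well defined when `ID` is unique in `df_b`. Here is a minimal sketch of the same idea with that check made explicit, using the sample data from the question. The helper name `add_diff_by_map` is made up for illustration and is not part of pandas.

```python
import pandas as pd

df_a = pd.DataFrame({'Time': [1, 2, 3], 'Volume': [5, 6, 7], 'ID': [1, 2, 3]})
df_b = pd.DataFrame({'Time': [1, 2, 3], 'Volume': [2, 3, 4], 'ID': [2, 1, 3]})

def add_diff_by_map(left, right):
    """Hypothetical helper: copy of `left` with Diff = left.Volume - right.Volume, matched on ID."""
    # The lookup needs exactly one Volume per ID on the right-hand side.
    if not right['ID'].is_unique:
        raise ValueError("right['ID'] must be unique for a map-based lookup")
    lookup = right.set_index('ID')['Volume']  # Series: ID -> Volume
    out = left.copy()
    out['Diff'] = out['Volume'] - out['ID'].map(lookup)
    return out

result = add_diff_by_map(df_a, df_b)
print(result)

# An ID that is absent from the right-hand frame yields NaN rather than an error.
partial = add_diff_by_map(df_a, df_b[df_b['ID'] != 3])
print(partial)
```

Returning a copy leaves `df_a` itself untouched, which makes it easier to compare the result against the original frame.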
One option is to merge the two dataframes on ID and then compute Diff:
df_a = df_a.merge(df_b.drop(['Time'], axis=1), on="ID", suffixes=['', '2'])
df_a['Diff'] = df_a['Volume'] - df_a['Volume2']
df_a:
Time Volume ID Volume2 Diff
0 1 5 1 3 2
1 2 6 2 2 4
2 3 7 3 4 3
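If the helper column should not remain in the result, or if some IDs in df_a may have no match in df_b, a left merge is one possible variant. This is a sketch under those assumptions, not the answerer's code. The `_b` suffix and the `validate='one_to_one'` check are choices made here for illustration.

```python
import pandas as pd

df_a = pd.DataFrame({'Time': [1, 2, 3], 'Volume': [5, 6, 7], 'ID': [1, 2, 3]})
df_b = pd.DataFrame({'Time': [1, 2, 3], 'Volume': [2, 3, 4], 'ID': [2, 1, 3]})

merged = df_a.merge(
    df_b[['ID', 'Volume']],   # only the key and the column to subtract
    on='ID',
    how='left',               # keep every row of df_a, matched or not
    suffixes=('', '_b'),      # df_b's Volume becomes Volume_b
    validate='one_to_one',    # raise if ID is duplicated on either side
)
merged['Diff'] = merged['Volume'] - merged['Volume_b']
result = merged.drop(columns='Volume_b')
print(result)
```

With `how='left'`, rows of df_a whose ID is missing from df_b are kept and get NaN in Diff, whereas the default inner merge would drop them.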
Merge the two dataframes on 'ID', then take the difference:
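import pandas as pd
df_a = pd.DataFrame({'Time': [1,2,3], 'Volume': [5,6,7], 'ID':[1,2,3]})
df_b = pd.DataFrame({'Time': [1,2,3], 'Volume': [2,3,4], 'ID':[2,1,3]})
# The shared 'Volume' column gets the default suffixes: Volume_x (df_a), Volume_y (df_b)
merged = pd.merge(df_a,df_b, on = 'ID')
# Assignment aligns on the row index; this works here because both frames are indexed 0..2
df_a['Diff'] = merged['Volume_x'] - merged['Volume_y']
print(df_a)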
#output:
Time Volume ID Diff
0 1 5 1 2
1 2 6 2 4
2 3 7 3 3
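The assignment above depends on `merged` and `df_a` sharing the same row labels, which holds for the default 0..n-1 index. Below is a sketch of one way to make it independent of the index, assuming `ID` is unique in df_b. The index labels 10, 11, 12 are made up purely to illustrate the case.

```python
import pandas as pd

# Same data as above, but df_a carries a non-default index (hypothetical labels).
df_a = pd.DataFrame({'Time': [1, 2, 3], 'Volume': [5, 6, 7], 'ID': [1, 2, 3]},
                    index=[10, 11, 12])
df_b = pd.DataFrame({'Time': [1, 2, 3], 'Volume': [2, 3, 4], 'ID': [2, 1, 3]})

# A left merge keeps one output row per df_a row, in df_a's order
# (given unique IDs in df_b); the merged frame gets a fresh 0..n-1 index.
merged = pd.merge(df_a, df_b, on='ID', how='left')

# to_numpy() discards merged's index, so the values are assigned by position.
df_a['Diff'] = (merged['Volume_x'] - merged['Volume_y']).to_numpy()
print(df_a)
```

Without `to_numpy()`, pandas would align the labels 0, 1, 2 against 10, 11, 12 and fill Diff with NaN.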