I have two dataframes. The first has the columns
Title Name Quantity ID
and the second has the columns
ID Quantity
with fewer rows than the first dataframe.
I need to find the difference between the Quantity values of the two dataframes wherever the ID columns match, and store this difference in a separate column of the first dataframe.
I tried this (it didn't work):
DF1[['ID','Quantity']].reset_index(drop=True).apply(lambda id_qty_tup : DF2[DF2.ID==asin_qty_tup[0]].quantity - id_qty_tup[1] , axis = 1)
Another approach is to take the ID and Quantity of each row in DF1 and iterate through every row of DF2, but that takes more time. I'm sure there is a better way.
You can perform index-aligned subtraction, and pandas takes care of the rest.
df['Diff'] = df.set_index('ID').Quantity.sub(df2.set_index('ID').Quantity).values
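A minimal sketch of the same idea, under stated assumptions: the frames `df1` and `df2` and their values are made up for illustration, and IDs are assumed to be unique within each frame. Assigning with `.values` relies on the aligned result coming back in the same row order as the original frame, so this variant maps the result back by ID instead, which does not depend on row order.

```python
import pandas as pd

# Hypothetical data: IDs are unique in each frame, and df2 covers only some of them.
df1 = pd.DataFrame({'Title': ['x', 'y', 'z'],
                    'Name': ['xx', 'yy', 'zz'],
                    'Quantity': [10, 20, 30],
                    'ID': ['C1', 'A1', 'B2']})
df2 = pd.DataFrame({'ID': ['A1', 'C1'],
                    'Quantity': [5, 7]})

# Index both Quantity columns by ID; sub() then aligns on ID before subtracting.
diff_by_id = df1.set_index('ID')['Quantity'].sub(df2.set_index('ID')['Quantity'])

# Look the result up by ID so each row gets its own difference,
# whatever order the aligned Series happens to be in.
df1['Diff'] = df1['ID'].map(diff_by_id)
print(df1)
```

IDs that are missing from `df2` (here `B2`) end up as NaN in `Diff`.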
Demo
Here, changetype is the index, and I've already set it, so pd.Series.sub will align the subtraction by default. Otherwise, you'd need to set the index as above.
df1
strings test
changetype
0 a very -1.250150
1 very boring text -1.376637
2 I cannot read it -1.011108
3 Hi everyone -0.527900
4 please go home -1.010845
5 or I will go 0.008159
6 now -0.470354
df2
strings test
changetype
0 a very very boring text 0.625465
1 I cannot read it -1.487183
2 Hi everyone 0.292866
3 please go home or I will go now 1.430081
df1.test.sub(df2.test)
changetype
0 -1.875614
1 0.110546
2 -1.303974
3 -1.957981
4 NaN
5 NaN
6 NaN
Name: test, dtype: float64
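The NaN rows in the output above are the index labels that exist in only one of the two Series. If a missing value should count as 0 instead (an assumption that may or may not suit your data), `Series.sub` accepts a `fill_value` argument. A small sketch with made-up numbers:

```python
import pandas as pd

# Hypothetical Series sharing the index name 'changetype'; s2 has fewer rows.
s1 = pd.Series([5.0, 7.0, 9.0], index=pd.Index([0, 1, 2], name='changetype'))
s2 = pd.Series([1.0, 2.0], index=pd.Index([0, 1], name='changetype'))

# Default behaviour: labels present in only one Series produce NaN.
plain = s1.sub(s2)

# With fill_value=0, a label missing from one side is treated as 0 there.
filled = s1.sub(s2, fill_value=0)

print(plain)
print(filled)
```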
You can use map in this case:
df['diff'] = df['ID'].map(df2.set_index('ID').Quantity) - df.Quantity
import pandas as pd
df = pd.DataFrame({'Title': ['A', 'B', 'C', 'D', 'E'],
'Name': ['AA', 'BB', 'CC', 'DD', 'EE'],
'Quantity': [1, 21, 14, 15, 611],
'ID': ['A1', 'A1', 'B2', 'B2', 'C1']})
df2 = pd.DataFrame({'Quantity': [11, 51, 44],
'ID': ['A1', 'B2', 'C1']})
We will use df2 to create a lookup (a Series indexed by ID) that maps each ID to its Quantity. So anywhere there is an ID == A1 in df, it gets assigned the Quantity 11, B2 gets assigned 51, and C1 gets assigned 44. Here I'll add it as another column just for illustration purposes.
df['Quantity2'] = df['ID'].map(df2.set_index('ID').Quantity)
print(df)
ID Name Quantity Title Quantity2
0 A1 AA 1 A 11
1 A1 BB 21 B 11
2 B2 CC 14 C 51
3 B2 DD 15 D 51
4 C1 EE 611 E 44
Then you can just subtract df['Quantity'] from the column we just created to get the difference (or subtract the new column from df['Quantity'] if you want the difference the other way round).
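Putting the steps of this answer together, a self-contained sketch using the same sample data might look like this. The column name `diff` follows the one-liner above; which direction to subtract in is your choice.

```python
import pandas as pd

df = pd.DataFrame({'Title': ['A', 'B', 'C', 'D', 'E'],
                   'Name': ['AA', 'BB', 'CC', 'DD', 'EE'],
                   'Quantity': [1, 21, 14, 15, 611],
                   'ID': ['A1', 'A1', 'B2', 'B2', 'C1']})
df2 = pd.DataFrame({'Quantity': [11, 51, 44],
                    'ID': ['A1', 'B2', 'C1']})

# Lookup Series: index is ID, values are df2's Quantity.
# IDs must be unique in df2 for map() to work with it.
lookup = df2.set_index('ID')['Quantity']

# Repeated IDs in df are fine: each row looks up its own ID.
df['diff'] = df['ID'].map(lookup) - df['Quantity']
print(df)
```

Any ID in df that does not appear in df2 would get NaN in the diff column.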