I have two pandas DataFrames:
df1 = pd.DataFrame({'id': [1001,1002,1004],
'col1': ["a","b","d"],
'col2': [1,2,6]})
df2 = pd.DataFrame({'id': [1001,1002,1003,1004,1005,1006,1007],
'a': [10,10,10,10,10,10,10],
'b': [2,2,2,2,2,2,2],
'c': [1,2,3,4,5,6,7],
'd': [5,5,5,5,5,4,5],
'e': [0,3,4,6,7,5,5]})
df1 and df2 share common id values.
For each row of df1, I want to treat the value in col1 (for example "a") as a column name in df2 and subtract that row's col2 value from that column, in the df2 row with the same id.
That may sound confusing, so here is an example:
In df1, id 1001 has col1 = a and col2 = 1,
so I want to subtract 1 from column a of df2 for id 1001: 10 - 1 = 9.
Another example:
In df1, id 1004 has col1 = d and col2 = 6, so I subtract 6 from column d of df2 for id 1004: 5 - 6 = -1.
The final outcome should look like this:
    a  b  c   d  e    id
0   9  2  1   5  0  1001
1  10  0  2   5  3  1002
2  10  2  3   5  4  1003
3  10  2  4  -1  6  1004
4  10  2  5   5  7  1005
5  10  2  6   4  5  1006
6  10  2  7   5  5  1007
How should I go about solving this efficiently in pandas? I have to repeat this operation many times on large datasets.
Thanks in advance
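Use apply with axis=1 to run a helper function on each row of df1. With a lambda it looks like this:
df1.apply(lambda row : updateDF2(row), axis=1)
The lambda is optional, since the function can be passed to apply directly. The full sample code is: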
import pandas as pd
df1 = pd.DataFrame({'id': [1001,1002,1004],
'col1': ["a","b","d"],
'col2': [1,2,6]})
df2 = pd.DataFrame({'id': [1001,1002,1003,1004,1005,1006,1007],
'a': [10,10,10,10,10,10,10],
'b': [2,2,2,2,2,2,2],
'c': [1,2,3,4,5,6,7],
'd': [5,5,5,5,5,4,5],
'e': [0,3,4,6,7,5,5]})
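def updateDF2(row):
    # Subtract this row's col2 from the df2 column named by col1, in the row with the matching id
    df2.loc[df2["id"] == row["id"], row["col1"]] -= row["col2"]

# Equivalent to: df1.apply(lambda row : updateDF2(row), axis=1)
df1.apply(updateDF2, axis=1)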
print(df2)
The output is:
    a  b  c   d  e    id
0   9  2  1   5  0  1001
1  10  0  2   5  3  1002
2  10  2  3   5  4  1003
3  10  2  4  -1  6  1004
4  10  2  5   5  7  1005
5  10  2  6   4  5  1006
6  10  2  7   5  5  1007
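Since the question asks about efficiency on big datasets, note that a row-wise apply runs a Python-level function call and a boolean mask over df2 for every row of df1. One possible vectorized alternative, offered only as a sketch, is to pivot df1 into a table of amounts to subtract and let pandas align it with df2. The names delta and result are made up for this illustration, and the sketch assumes that every value in col1 is a column of df2 and that all non-id columns of df2 are numeric.

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1001, 1002, 1004],
                    'col1': ["a", "b", "d"],
                    'col2': [1, 2, 6]})
df2 = pd.DataFrame({'id': [1001, 1002, 1003, 1004, 1005, 1006, 1007],
                    'a': [10, 10, 10, 10, 10, 10, 10],
                    'b': [2, 2, 2, 2, 2, 2, 2],
                    'c': [1, 2, 3, 4, 5, 6, 7],
                    'd': [5, 5, 5, 5, 5, 4, 5],
                    'e': [0, 3, 4, 6, 7, 5, 5]})

# Turn df1 into a wide table: one row per id, one column per col1 value,
# holding the amount to subtract. Repeated (id, col1) pairs are summed.
delta = df1.pivot_table(index="id", columns="col1", values="col2",
                        aggfunc="sum", fill_value=0)

# Use id as the index so both frames can be aligned on it.
result = df2.set_index("id")

# Give delta the same shape as df2; cells not mentioned in df1 become 0.
# Ids that appear in df1 but not in df2 are dropped by this step.
delta = delta.reindex(index=result.index, columns=result.columns, fill_value=0)

# A single aligned subtraction replaces the per-row updates.
result = (result - delta).reset_index()
print(result)
```

Unlike the apply version, this leaves df2 unchanged and returns a new frame, with id as the first column.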