The input is two pandas DataFrames, df1 and df2.
df1
       Document No     Amount
6    8138000628REV       0.00
9    8138000602REV       0.00
24   8138000607REV     310.00
11   8138000605REV       0.00
14    813800602REV       0.00
45  8138000525AREV       0.00
84   8138000861REV  200000.00
87   8138000748REV  -80770.82
df2
     Document No     Amount
2     8138000628       0.00
5     8138000602       0.00
12    8138000605       0.00
16     813800602       0.00
42   8138000525A       0.00
80    8138000861  215208.00
85    8138000748   80770.82
The required output is keyed on "Document No". For each "Document No" in df1: if it is not present in df2, the row becomes part of df3. If it is present in df2 and the Amount differs between df1 and df2, the row also becomes part of df3, using the "Document No" from df2 (that is, without the "REV" suffix), and the Amount is the difference between the two amounts.
df3
    Document No    Amount
24   8138000607    310.00
84   8138000861  15208.00  --> (215208.00 - 200000.00)
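For anyone who wants to reproduce the problem, the two sample frames above can be rebuilt roughly as follows. This is only a sketch: the values and index labels are copied from the tables in the question, and the string dtype of "Document No" is an assumption.

```python
import pandas as pd

# Sample data copied from the tables above; index labels are kept as shown.
df1 = pd.DataFrame(
    {
        "Document No": [
            "8138000628REV", "8138000602REV", "8138000607REV", "8138000605REV",
            "813800602REV", "8138000525AREV", "8138000861REV", "8138000748REV",
        ],
        "Amount": [0.00, 0.00, 310.00, 0.00, 0.00, 0.00, 200000.00, -80770.82],
    },
    index=[6, 9, 24, 11, 14, 45, 84, 87],
)

df2 = pd.DataFrame(
    {
        "Document No": [
            "8138000628", "8138000602", "8138000605", "813800602",
            "8138000525A", "8138000861", "8138000748",
        ],
        "Amount": [0.00, 0.00, 0.00, 0.00, 0.00, 215208.00, 80770.82],
    },
    index=[2, 5, 12, 16, 42, 80, 85],
)

print(df1)
print(df2)
```

Note that document 8138000748 has -80770.82 in df1 and 80770.82 in df2 and does not appear in the expected df3, so amounts that differ only in sign are evidently treated as matching.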
So far I have achieved this using dictionaries and lists with the code snippet below, and I do get the expected result. However, I assume pandas can do the same thing in far fewer lines of code. I am not well versed in pandas, so I would appreciate a hint or a pointer on how to achieve this using pandas only.
%%time
import pandas as pd
Path_M='somepath'
df_led = pd.read_excel(Path_M + 'ABC Ltd_ recon.xlsx',
                       usecols = ['Document No','Remaining Amount'],
                       sheet_name='Ledger')
df_led['combined']=df_led.values.tolist()
list1 = df_led['combined'].tolist()
thisdict_pir={}
for item in list1:
    ll_pir=[]
    key = item[0]
    key=str(key)
    if key.endswith('REV'):
        if key in thisdict_pir:
            var = thisdict_pir[key]
            var.append(item)
            thisdict_pir[key] = var
        else:
            ll_pir.append(item)
            thisdict_pir[key]=ll_pir
listofdocnumberwithnorev=[]
for item in listofextdocno:
    if item.endswith('REV'):
        listofdocnumberwithnorev.append(item[:-3])
thisdict_pi={}
for extdocno in listofdocnumberwithnorev:
    if extdocno in thisdict:
        data=thisdict[extdocno]
        thisdict_pi[extdocno]=data
listofextdocnoin=thisdict_pir.keys()
finaldict={}
for inv in listofextdocnoin:
    listofpir=thisdict_pir[inv]
    #print(listofpir)
    if inv[:-3] in thisdict_pi:
        listofpi=thisdict_pi[inv[:-3]]
        #print(listofpi)
    else:
        listofpi=[]
        print(listofpi)
    if (len(listofpir)>0):
        #print(listofpir)
        amtinvr=0
        for pinvr in listofpir:
            amtinvr=pinvr[5]+amtinvr
        #print(amtinvr)
        if (len(listofpi)>0):
            #print(listofpi)
            amtinv=0
            for pinv in listofpi:
                amtinv=pinv[5]+amtinv
            #print(amtinv)
            if abs(amtinvr) != abs(amtinv):
                val=pinvr
                finaldict[inv]=val
        elif len(listofpi)<1:
            finaldict[inv]=pinvr
You can merge the two DataFrames and then filter out the rows you don't need:
df3 = (
    df1.assign(**{'Document No': df1['Document No'].replace('REV$', '', regex=True)})
       .merge(df2, how='left', on='Document No', indicator=True, suffixes=('', '2'))
       .query("(_merge == 'left_only') | (Amount != -Amount2)")
       .assign(Amount=lambda x: x['Amount2'].fillna(2*x['Amount']).sub(x['Amount']))
       [['Document No', 'Amount']]
)
Output:
>>> df3
  Document No   Amount
2  8138000607    310.0
6  8138000861  15208.0
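To see why only those two rows survive, it may help to split the chain into its intermediate steps. The sketch below rebuilds the sample frames from the question and uses the same merge and the same filter condition as the answer; the names `left`, `merged`, `unmatched`, `differs` and `kept` are introduced here purely for illustration.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Document No": ["8138000628REV", "8138000602REV", "8138000607REV",
                     "8138000605REV", "813800602REV", "8138000525AREV",
                     "8138000861REV", "8138000748REV"],
     "Amount": [0.00, 0.00, 310.00, 0.00, 0.00, 0.00, 200000.00, -80770.82]},
    index=[6, 9, 24, 11, 14, 45, 84, 87],
)
df2 = pd.DataFrame(
    {"Document No": ["8138000628", "8138000602", "8138000605", "813800602",
                     "8138000525A", "8138000861", "8138000748"],
     "Amount": [0.00, 0.00, 0.00, 0.00, 0.00, 215208.00, 80770.82]},
    index=[2, 5, 12, 16, 42, 80, 85],
)

# Step 1: strip the trailing "REV" so the keys are comparable with df2.
left = df1.assign(
    **{"Document No": df1["Document No"].str.replace(r"REV$", "", regex=True)}
)

# Step 2: left merge; `_merge` records whether each df1 row found a partner,
# and df2's amount arrives as "Amount2" (NaN when there is no partner).
merged = left.merge(
    df2, how="left", on="Document No", indicator=True, suffixes=("", "2")
)

# Step 3: keep rows with no partner, or whose amounts do not cancel out.
unmatched = merged["_merge"].eq("left_only")
differs = merged["Amount"].ne(-merged["Amount2"])
kept = merged[unmatched | differs]

print(merged)
print(kept)
```

The merge discards the original index of df1 and renumbers the rows from 0, which is why the result is labelled 2 and 6 rather than 24 and 84.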
Update: to preserve the index from df1 (24, 84), use this modified version:
df3 = (
    df1.assign(**{'Document No': df1['Document No'].replace('REV$', '', regex=True)})
       .reset_index()
       .merge(df2, how='left', on='Document No', indicator=True, suffixes=('', '2'))
       .query("(_merge == 'left_only') | (Amount != -Amount2)")
       .assign(Amount=lambda x: x['Amount2'].fillna(2*x['Amount']).sub(x['Amount']))
       .set_index('index')[['Document No', 'Amount']].rename_axis(None)
)
Output:
>>> df3
   Document No   Amount
24  8138000607    310.0
84  8138000861  15208.0
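The question's own loop compares absolute values (`abs(amtinvr) != abs(amtinv)`), whereas the filter above checks that the two amounts cancel. For readers who want the absolute-value comparison instead, a possible variant is sketched below. It is not the method from the answer: it uses `Series.map` rather than a merge, it assumes "Document No" is unique in df2, and taking the reported amount as |df2| minus |df1| is an assumption based on the single worked example (215208.00 - 200000.00). The names `key`, `amount2`, `missing` and `differs` are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Document No": ["8138000628REV", "8138000602REV", "8138000607REV",
                     "8138000605REV", "813800602REV", "8138000525AREV",
                     "8138000861REV", "8138000748REV"],
     "Amount": [0.00, 0.00, 310.00, 0.00, 0.00, 0.00, 200000.00, -80770.82]},
    index=[6, 9, 24, 11, 14, 45, 84, 87],
)
df2 = pd.DataFrame(
    {"Document No": ["8138000628", "8138000602", "8138000605", "813800602",
                     "8138000525A", "8138000861", "8138000748"],
     "Amount": [0.00, 0.00, 0.00, 0.00, 0.00, 215208.00, 80770.82]},
    index=[2, 5, 12, 16, 42, 80, 85],
)

# Document numbers from df1 with the trailing "REV" removed (index of df1 kept).
key = df1["Document No"].str.replace(r"REV$", "", regex=True)

# Look up the df2 amount for each key; NaN where df2 has no such document.
# Assumption: "Document No" is unique in df2, otherwise the lookup fails.
amount2 = key.map(df2.set_index("Document No")["Amount"])

missing = amount2.isna()
differs = ~missing & amount2.abs().ne(df1["Amount"].abs())

df3 = pd.DataFrame(
    {
        "Document No": key,
        # Unmatched rows keep df1's amount; matched rows get |df2| - |df1|.
        "Amount": (amount2.abs() - df1["Amount"].abs()).where(~missing, df1["Amount"]),
    }
)[missing | differs]

print(df3)
```

Because nothing is merged, the rows keep their df1 index labels without any `reset_index` / `set_index` round trip.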