I am trying to solve the problem described below. I have a DataFrame like this:
Date     Item    Type  Qty  Price
1/1/18   Orange  Add   100  25
5/1/18   Orange  Add   20   40
8/1/18   Orange  Add   40   20
18/1/18  Orange  Add   10   35
27/2/18  Orange  Sub   100  55
15/4/18  Orange  Sub   30   45
and I want to get the intermediate Dataframe like below:
Date     Item    Type  Qty  Price  Diff
1/1/18   Orange  Add   0    25     30
5/1/18   Orange  Add   0    40     5
8/1/18   Orange  Add   30   20     25
18/1/18  Orange  Add   10   35
and then the final DataFrame should look like this:
Date     Item    Type  Qty  Price
8/1/18   Orange  Add   30   20
18/1/18  Orange  Add   10   35
NOTE: Diff is the difference between the Sub Price and the Add Price. Qty is also updated: the Sub quantities are subtracted from the Add quantities, starting from the earliest Add row, as in the example above.
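To make the arithmetic in the NOTE concrete, here is a minimal sketch of the oldest-first matching for the Orange rows only. The function name `fifo_match` and the split into separate `adds` and `subs` frames are assumptions made for illustration. The question also does not say what Diff should be when one Add row is consumed by several Sub rows; this sketch simply keeps the most recent match.

```python
import pandas as pd

# Orange rows from the example, assumed to be in date order already.
adds = pd.DataFrame({'Date': ['1/1/18', '5/1/18', '8/1/18', '18/1/18'],
                     'Qty': [100, 20, 40, 10],
                     'Price': [25, 40, 20, 35]})
subs = pd.DataFrame({'Date': ['27/2/18', '15/4/18'],
                     'Qty': [100, 30],
                     'Price': [55, 45]})

def fifo_match(adds, subs):
    """Consume Add rows oldest-first and record Sub price minus Add price."""
    remaining = adds['Qty'].tolist()
    diff = [None] * len(adds)
    i = 0  # position of the oldest Add row that still has stock
    for sub_qty, sub_price in zip(subs['Qty'], subs['Price']):
        while sub_qty > 0 and i < len(remaining):
            used = min(sub_qty, remaining[i])
            remaining[i] -= used
            sub_qty -= used
            # Assumption: a later Sub overwrites an earlier Diff on the same row.
            diff[i] = sub_price - adds['Price'].iloc[i]
            if remaining[i] == 0:
                i += 1
    out = adds.copy()
    out['Qty'] = remaining
    out['Diff'] = diff
    return out

intermediate = fifo_match(adds, subs)
print(intermediate)
```

For the Orange data this leaves Qty as 0, 0, 30, 10 and Diff as 30, 5, 25 for the first three rows, with no Diff on the last row. Dropping the rows where Qty is 0 then gives the final table.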
Could anyone please help with how this can be achieved? I have tried groupby, apply and transform, but have not managed to get this result yet.
Here is the code I have so far; it is still in development and not complete:
import pandas as pd

def FruitSummary():
    df = pd.DataFrame([
        ['01/1/18', 'Orange', 'Add', 100, 25],
        ['05/1/18', 'Orange', 'Add', 20, 40],
        ['08/1/18', 'Orange', 'Add', 40, 20],
        ['18/1/18', 'Orange', 'Add', 10, 35],
        ['27/2/18', 'Orange', 'Sub', 100, 55],
        ['15/4/18', 'Orange', 'Sub', 30, 45],
        ['02/1/18', 'Banana', 'Add', 110, 7],
        ['04/1/18', 'Banana', 'Add', 20, 9],
        ['11/1/18', 'Banana', 'Add', 40, 4],
        ['10/2/18', 'Banana', 'Add', 10, 3],
        ['15/3/18', 'Banana', 'Sub', 100, 9],
        ['15/4/18', 'Banana', 'Sub', 50, 8],
        ['10/3/18', 'Kiwi', 'Add', 80, 29],
        ['12/3/18', 'Berry', 'Add', 25, 5],
        ['18/4/18', 'Berry', 'Add', 15, 8]],
        columns=['Date', 'Item', 'Type', 'Qty', 'Price'])
    print(df)

    def fruit_stat(dfIN):
        # Debug output only; nothing is returned yet.
        print(dfIN)
        print((dfIN['Type'] == 'Sub').unique(), (dfIN['Type'] == 'ODD').unique())
        if len(dfIN) > 1 and (True in (dfIN['Type'] == 'Sub').unique()):
            print(dfIN['Item'].iloc[1], "'len > 1'", "'Sub True'")

    dfFS = df.groupby(['Item']).apply(fruit_stat)
    print(dfFS)

FruitSummary()
I was able to find a solution, though I am not sure it is optimal; there may be a better one.
import numpy as np
import pandas as pd

df = pd.DataFrame([['01/1/18', 'Orange', 'Add', 100, 25],
                   ['05/1/18', 'Orange', 'Add', 20, 40],
                   ['08/1/18', 'Orange', 'Add', 40, 20],
                   ['18/1/18', 'Orange', 'Add', 10, 35],
                   ['27/2/18', 'Orange', 'Sub', 100, 55],
                   ['15/4/18', 'Orange', 'Sub', 30, 45],
                   ['02/1/18', 'Banana', 'Add', 110, 7],
                   ['04/1/18', 'Banana', 'Add', 20, 9],
                   ['11/1/18', 'Banana', 'Add', 40, 4],
                   ['10/2/18', 'Banana', 'Add', 10, 3],
                   ['15/3/18', 'Banana', 'Sub', 100, 9],
                   ['15/4/18', 'Banana', 'Sub', 50, 8],
                   ['10/3/18', 'Kiwi', 'Add', 80, 29],
                   ['12/3/18', 'Berry', 'Add', 25, 5],
                   ['18/4/18', 'Berry', 'Add', 15, 8],
                   ['16/3/18', 'Cherry', 'Add', 25, 5],
                   ['21/4/18', 'Cherry', 'Sub', 25, 8],
                   ['19/3/18', 'Grapes', 'Add', 25, 5],
                   ['23/4/18', 'Grapes', 'Sub', 15, 8]],
                  columns=['Date', 'Item', 'Type', 'Qty', 'Price'])

def FruitSummary(df):
    # Running total of Qty within each (Item, Type) pair.
    df['CumSum'] = df.groupby(['Item', 'Type'])['Qty'].cumsum()
    print(df)

    def fruit_stat(dfg):
        dfg = dfg.copy()  # work on a copy so the assignments below are safe
        # Only items with at least one Sub row need adjusting.
        if dfg[dfg['Type'] == 'Sub']['Qty'].count():
            # Total quantity subtracted for this item.
            subT = dfg[dfg['Type'] == 'Sub']['CumSum'].iloc[-1]
            # Zero out every row whose running total is fully consumed.
            dfg['Qty'] = np.where((dfg['CumSum'] - subT) <= 0, 0, dfg['Qty'])
            dfg = dfg[dfg['Qty'] > 0].copy()
            if len(dfg) > 0:
                # The first surviving row is only partly consumed.
                dfg.iloc[0, dfg.columns.get_loc('Qty')] = dfg['CumSum'].iloc[0] - subT
        return dfg

    dfFS = df.groupby(['Item'], as_index=False).apply(fruit_stat).drop(['CumSum'], axis=1).reset_index(drop=True)
    print(dfFS)

FruitSummary(df)
The code above produces the following output:
      Date    Item Type  Qty  Price
0  11/1/18  Banana  Add   20      4
1  10/2/18  Banana  Add   10      3
2  12/3/18   Berry  Add   25      5
3  18/4/18   Berry  Add   15      8
4  19/3/18  Grapes  Add   10      5
5  10/3/18    Kiwi  Add   80     29
6  08/1/18  Orange  Add   30     20
7  18/1/18  Orange  Add   10     35
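As for whether there is a better way: one possible alternative, sketched here rather than offered as a definitive answer, avoids `apply` and works on whole columns. The idea is that what remains of each Add row is its running total minus the item's total Sub quantity, limited to the range from 0 to the row's own Qty. The function name `remaining_stock` is made up for this example, and the sketch assumes the rows of each item are already in date order.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['01/1/18', 'Orange', 'Add', 100, 25],
                   ['05/1/18', 'Orange', 'Add', 20, 40],
                   ['08/1/18', 'Orange', 'Add', 40, 20],
                   ['18/1/18', 'Orange', 'Add', 10, 35],
                   ['27/2/18', 'Orange', 'Sub', 100, 55],
                   ['15/4/18', 'Orange', 'Sub', 30, 45],
                   ['02/1/18', 'Banana', 'Add', 110, 7],
                   ['04/1/18', 'Banana', 'Add', 20, 9],
                   ['11/1/18', 'Banana', 'Add', 40, 4],
                   ['10/2/18', 'Banana', 'Add', 10, 3],
                   ['15/3/18', 'Banana', 'Sub', 100, 9],
                   ['15/4/18', 'Banana', 'Sub', 50, 8],
                   ['10/3/18', 'Kiwi', 'Add', 80, 29],
                   ['12/3/18', 'Berry', 'Add', 25, 5],
                   ['18/4/18', 'Berry', 'Add', 15, 8],
                   ['16/3/18', 'Cherry', 'Add', 25, 5],
                   ['21/4/18', 'Cherry', 'Sub', 25, 8],
                   ['19/3/18', 'Grapes', 'Add', 25, 5],
                   ['23/4/18', 'Grapes', 'Sub', 15, 8]],
                  columns=['Date', 'Item', 'Type', 'Qty', 'Price'])

def remaining_stock(df):
    """Return the Add rows still in stock after Subs consume them oldest-first."""
    adds = df[df['Type'] == 'Add'].copy()
    # Total quantity removed per item; items without Sub rows get 0.
    sub_total = df[df['Type'] == 'Sub'].groupby('Item')['Qty'].sum()
    removed = adds['Item'].map(sub_total).fillna(0)
    # Running Add total per item, in the given (assumed chronological) order.
    cum_add = adds.groupby('Item')['Qty'].cumsum()
    # Stock left in each row: never below 0, never above the row's own Qty.
    left = (cum_add - removed).clip(lower=0)
    adds['Qty'] = np.minimum(adds['Qty'], left).astype(int)
    adds = adds[adds['Qty'] > 0]
    # Stable sort keeps the date order within each item.
    return adds.sort_values('Item', kind='stable').reset_index(drop=True)

result = remaining_stock(df)
print(result)
```

On the sample data this gives the same rows and quantities as the output above. It does not compute the Diff column.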