I have a Data set which looks like this:
ID Date Input1 Input2 1.Eff 2.Eff Qty Time
3 1/2/2019 A A 32.08 76.64 5 200
3 1/3/2019 A A 55.95 41.18 10 100
3 1/4/2019 A A 56.61 50 5 300
3 1/4/2019 A B 56.61 35.67 10 300
In the output field i want two columns new_Eff and new_time which will be calculated from the above data, the logic to calculate New_time and New_eff is: 1, If for ID and Date combination if there is one line item then 1.Eff will be equal to new_eff and new_time will be equal to time. So for row 1 and Row2 output will be
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
3 1/2/2019 A A 32.08 76.64 5 200 32.08 200
3 1/3/2019 A A 55.95 41.18 10 100 55.95 100
In row3 and row4 date is not changing so new_Eff will be equal to Day-1 2.Eff and time will be equal to Time/Qty, so row 3 output will be:
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
3 1/4/2019 A A 56.61 50 5 300 300/5=60 41.48 (on 3rd 2.Eff is 41.48)
And in row 4 new_time will be total time-60=300-60=240 and new_eff will be new_time/Qty=240/5=48
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
3 1/4/2019 A B 56.61 35.67 10 300 240 48
so output table will look like this:
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
3 1/2/2019 A A 32.08 76.64 5 200 32.08 200
3 1/3/2019 A A 55.95 41.18 10 100 55.95 100
3 1/4/2019 A A 56.61 50 5 300 60 41.48
3 1/4/2019 A B 56.61 35.67 10 300 240 48
Can anyone please help me how to perform these back calculations when there are multiple rows for same date..
Thanks in advance
you can do it using groupby function like below
df = pd.DataFrame([[3, '1/2/2019', 'A', 'A', 32.08, 76.64, 5, 200], [3, '1/3/2019', 'A', 'A', 55.95, 41.18, 10, 100], [3, '1/4/2019', 'A', 'A', 56.61, 50.0, 5, 300], [3, '1/4/2019', 'A', 'B', 56.61, 35.67, 10, 300]], columns=('ID', 'Date', 'Input1', 'Input2', '1.Eff', '2.Eff', 'Qty', 'Time'))
def calc(g):
if len(g)>1:
fst = g.iloc[[0]]
other = g.iloc[1:]
fst["new_time"] = fst["Time"]/fst["Qty"]
fst["new_EFf"] = fst["2.Eff"]
other["new_time"] = other["Time"]-60
other["new_EFf"] = other["new_time"].values/fst["Qty"].values
g = pd.concat([fst,other], axis=0)
else:
g = g.copy()
g["new_time"] = g["Time"]
g["new_EFf"] = g["1.Eff"]
return g
df.groupby(["ID", "Date"]).apply(calc).reset_index(drop=True)
Result
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
0 3 1/2/2019 A A 32.08 76.64 5 200 200.0 32.08
1 3 1/3/2019 A A 55.95 41.18 10 100 100.0 55.95
2 3 1/4/2019 A A 56.61 50.00 5 300 60.0 50.00
3 3 1/4/2019 A B 56.61 35.67 10 300 240.0 48.00
Here's my approach with np.select
:
first = df['Date'] == df['Date'].shift(-1)
second = first.shift(fill_value=False)
df['new_time'] = np.select((first, second),
(df['Time']/df['Qty'], df['Time']-60),
df['Time'])
df['new_Eff'] = np.select((first, second),
(df['2.Eff'].shift(), df['new_time']/df['Qty']),
df['1.Eff']
)
Output:
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_Eff
0 3 1/2/2019 A A 32.08 76.64 5 200 200.0 32.08
1 3 1/3/2019 A A 55.95 41.18 10 100 100.0 55.95
2 3 1/4/2019 A A 56.61 50.00 5 300 60.0 41.18
3 3 1/4/2019 A B 56.61 35.67 10 300 240.0 24.00
Note : From your question And in row 4 new_time will be total time-60=300-60=240
and new_eff
will be new_time/Qty=240/5=48
Should Qty
in row4
is 10
and new_Eff
is 24
as showed in my answer?
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.