
Back calculation on Excel columns in Python

I have a dataset that looks like this:

ID    Date       Input1    Input2   1.Eff   2.Eff    Qty     Time   
3   1/2/2019        A        A      32.08   76.64     5      200
3   1/3/2019        A        A      55.95   41.18     10     100
3   1/4/2019        A        A      56.61    50        5     300
3   1/4/2019        A        B      56.61   35.67     10     300

In the output I want two new columns, new_EFf and new_time, calculated from the data above. The logic is as follows. If an ID and Date combination has only one line item, new_EFf equals 1.Eff and new_time equals Time. So for rows 1 and 2 the output will be:

ID    Date       Input1    Input2   1.Eff   2.Eff    Qty     Time    new_time    new_EFf
3   1/2/2019        A        A      32.08   76.64     5      200       200        32.08
3   1/3/2019        A        A      55.95   41.18     10     100       100        55.95

In rows 3 and 4 the date does not change, so for row 3 new_EFf will equal the previous day's 2.Eff and new_time will equal Time/Qty. The row 3 output will be:

ID    Date       Input1    Input2   1.Eff   2.Eff    Qty     Time    new_time    new_EFf
3   1/4/2019        A        A      56.61   50       5      300     300/5=60     41.18 (2.Eff on 1/3/2019 is 41.18)

And in row 4 new_time will be total time-60=300-60=240 and new_eff will be new_time/Qty=240/5=48

ID    Date       Input1    Input2   1.Eff   2.Eff    Qty     Time    new_time    new_EFf 
3   1/4/2019        A        B      56.61   35.67     10     300      240          48

So the output table will look like this:

ID    Date       Input1    Input2   1.Eff   2.Eff    Qty     Time    new_time    new_EFf
3   1/2/2019        A        A      32.08   76.64     5      200       200        32.08
3   1/3/2019        A        A      55.95   41.18     10     100       100        55.95
3   1/4/2019        A        A      56.61   50        5      300        60        41.18
3   1/4/2019        A        B      56.61   35.67     10     300       240        48

Can anyone please help me perform these back calculations when there are multiple rows for the same date?

Thanks in advance

You can do it with the groupby function, like this:


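import pandas as pd

df = pd.DataFrame([[3, '1/2/2019', 'A', 'A', 32.08, 76.64, 5, 200], [3, '1/3/2019', 'A', 'A', 55.95, 41.18, 10, 100], [3, '1/4/2019', 'A', 'A', 56.61, 50.0, 5, 300], [3, '1/4/2019', 'A', 'B', 56.61, 35.67, 10, 300]], columns=('ID', 'Date', 'Input1', 'Input2', '1.Eff', '2.Eff', 'Qty', 'Time'))

def calc(g):
    if len(g) > 1:
        # Copy the slices so the assignments below do not raise SettingWithCopyWarning
        fst = g.iloc[[0]].copy()
        other = g.iloc[1:].copy()
        fst["new_time"] = fst["Time"] / fst["Qty"]
        # Uses the first row's own 2.Eff (50.00 in this data)
        fst["new_EFf"] = fst["2.Eff"]

        # Subtract the first row's new_time (60 in this data) rather than hard-coding it
        other["new_time"] = other["Time"] - fst["new_time"].iloc[0]
        # Divides by the first row's Qty, as in the question's 240/5
        other["new_EFf"] = other["new_time"].values / fst["Qty"].values
        g = pd.concat([fst, other], axis=0)
    else:
        # Single row for this ID/Date: carry Time and 1.Eff over unchanged
        g = g.copy()
        g["new_time"] = g["Time"]
        g["new_EFf"] = g["1.Eff"]

    return g

df.groupby(["ID", "Date"]).apply(calc).reset_index(drop=True)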

Result

   ID      Date Input1 Input2  1.Eff  2.Eff  Qty  Time  new_time  new_EFf
0   3  1/2/2019      A      A  32.08  76.64    5   200     200.0    32.08
1   3  1/3/2019      A      A  55.95  41.18   10   100     100.0    55.95
2   3  1/4/2019      A      A  56.61  50.00    5   300      60.0    50.00
3   3  1/4/2019      A      B  56.61  35.67   10   300     240.0    48.00
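
The result above puts 50.00 in new_EFf for the first 1/4/2019 row, while the question asks for the previous day's 2.Eff (41.18 in the input data). Below is a minimal sketch of one way to get that without apply. The helper name back_calculate and the intermediate names are my own, not from the question. It assumes the rows are already in chronological order within each ID, takes "previous day" to mean the previous date present for that ID, and divides later rows by the first row's Qty, following the question's 240/5 arithmetic.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[3, '1/2/2019', 'A', 'A', 32.08, 76.64, 5, 200],
     [3, '1/3/2019', 'A', 'A', 55.95, 41.18, 10, 100],
     [3, '1/4/2019', 'A', 'A', 56.61, 50.0, 5, 300],
     [3, '1/4/2019', 'A', 'B', 56.61, 35.67, 10, 300]],
    columns=['ID', 'Date', 'Input1', 'Input2', '1.Eff', '2.Eff', 'Qty', 'Time'])


def back_calculate(frame):
    """Hypothetical helper; assumes rows are already sorted by ID and date."""
    out = frame.copy()
    keys = ['ID', 'Date']
    grp = out.groupby(keys, sort=False)

    single = grp['Qty'].transform('size') == 1   # only one row for this ID/Date
    is_first = grp.cumcount() == 0               # first row within its ID/Date group
    first_qty = grp['Qty'].transform('first')
    first_time = grp['Time'].transform('first') / first_qty

    # Last 2.Eff of each ID/Date, shifted down by one date within the same ID.
    prev_eff = (grp['2.Eff'].last()
                .groupby(level='ID').shift()
                .rename('prev_eff').reset_index())
    prev = out[keys].merge(prev_eff, on=keys, how='left')['prev_eff'].to_numpy()

    out['new_time'] = np.where(single, out['Time'],
                               np.where(is_first, first_time, out['Time'] - first_time))
    out['new_EFf'] = np.where(single, out['1.Eff'],
                              np.where(is_first, prev, out['new_time'] / first_qty))
    return out


result = back_calculate(df)
print(result[['Date', 'new_time', 'new_EFf']])
```

If a multi-row date happens to be the first date for an ID, there is no previous day to read from, so new_EFf on its first row comes out as NaN in this sketch.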

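Here's my approach with np.select:

import numpy as np

# True on a row whose Date matches the next row (first row of a same-date pair)
first = df['Date'] == df['Date'].shift(-1)
# True on the row right after it (second row of the pair)
second = first.shift(fill_value=False)

# Previous row's Time/Qty, i.e. the first row's new_time as seen from the second row
# (60 in this data, so it is no longer hard-coded)
prev_time = (df['Time']/df['Qty']).shift()

df['new_time'] = np.select((first, second),
                           (df['Time']/df['Qty'], df['Time'] - prev_time),
                           df['Time'])

df['new_Eff'] = np.select((first, second),
                          (df['2.Eff'].shift(), df['new_time']/df['Qty']),
                          df['1.Eff']
                         )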

Output:

   ID      Date Input1 Input2  1.Eff  2.Eff  Qty  Time  new_time  new_Eff
0   3  1/2/2019      A      A  32.08  76.64    5   200     200.0    32.08
1   3  1/3/2019      A      A  55.95  41.18   10   100     100.0    55.95
2   3  1/4/2019      A      A  56.61  50.00    5   300      60.0    41.18
3   3  1/4/2019      A      B  56.61  35.67   10   300     240.0    24.00

Note: your question says "And in row 4 new_time will be total time-60=300-60=240 and new_eff will be new_time/Qty=240/5=48". Shouldn't Qty in row 4 be 10, which makes new_Eff 24, as shown in my answer?
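
To make the two readings in the note concrete, here is a small sketch that computes new_Eff for the second 1/4/2019 row both ways. The names keys, own_qty and first_qty are illustrative only. Comparing ID together with Date is my assumption about how several IDs should be handled, and the sketch only covers same-date pairs, not longer runs.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [3, 3, 3, 3],
    'Date': ['1/2/2019', '1/3/2019', '1/4/2019', '1/4/2019'],
    'Qty': [5, 10, 5, 10],
    'Time': [200, 100, 300, 300],
})

keys = df[['ID', 'Date']]
# True on a row whose ID and Date both match the previous row (second row of a pair).
second = (keys == keys.shift()).all(axis=1)

# Previous row's Time/Qty: the first row's new_time, as seen from the second row.
prev_new_time = (df['Time'] / df['Qty']).shift()
new_time = df['Time'] - prev_new_time        # only meaningful where `second` is True

own_qty = (new_time / df['Qty'])[second]            # divide by the row's own Qty
first_qty = (new_time / df['Qty'].shift())[second]  # divide by the first row's Qty

print(own_qty.tolist(), first_qty.tolist())
```

Dividing by the row's own Qty gives 24, and dividing by the first row's Qty gives 48, so which one is right depends on what the question intends.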
