Pandas: add a new column with groupby based on values in 3 different columns

Question

I have the following df:

    Document    Date    Schedule    Quantity    Key
0   123      2020-12-02    1          20         1
1   123      2020-12-02    2          10         0
2   123      2020-12-02    3           5         0
3   456      2020-12-02    4          10         0
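
For anyone who wants to reproduce this, here is a minimal sketch of how the frame above could be built. The dtypes are an assumption (Date as datetime, the rest as integers); the post does not state them.

```python
import pandas as pd

# Sample data copied from the table above; dtypes are assumed.
df = pd.DataFrame({
    "Document": [123, 123, 123, 456],
    "Date": pd.to_datetime(["2020-12-02"] * 4),
    "Schedule": [1, 2, 3, 4],
    "Quantity": [20, 10, 5, 10],
    "Key": [1, 0, 0, 0],
})
print(df)
```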

I want to add a new column, computed per group of Document and Date: if the Quantity in row 0 (the row where Key = 1) differs from the Quantity in the row with the lowest Schedule among the remaining rows (those where Key = 0), then New_Col = 1. If the quantities are the same, New_Col = 0.

Desired output:

    Document    Date    Schedule    Quantity    Key   New_Col
0   123      2020-12-02    1          20         1       1
1   123      2020-12-02    2          10         0       0
2   123      2020-12-02    3           5         0       0
3   456      2020-12-02    4          10         0       0

Answer 1

Define the following function:

import pandas as pd

def getNewCol(grp):
    # Default result: 0 for every row of the group
    rv = pd.Series(0, index=grp.index)
    # Quantity from the row with Key == 1 (a Series)
    qn = grp.query('Key == 1').Quantity
    # The "other" rows (Key != 1)
    others = grp.query('Key != 1')
    if qn.size == 0 or others.empty:   # Nothing to compare
        return rv
    qnK1 = qn.iloc[0]  # The Quantity itself
    # Quantity from the "other" row with the lowest Schedule
    qnMin = others.sort_values('Schedule').Quantity.iloc[0]
    if qnK1 != qnMin:  # Different
        rv.loc[qn.index[0]] = 1  # Flag the Key == 1 row
    return rv

Then apply it and save the result in a new column:

df['New_Col'] = df.groupby(['Document', 'Date'], as_index=False)\
    .apply(getNewCol).reset_index(level=0, drop=True)

The result is:

   Document       Date  Schedule  Quantity  Key  New_Col
0       123 2020-12-02         1        20    1        1
1       123 2020-12-02         2        10    0        0
2       123 2020-12-02         3         5    0        0
3       456 2020-12-02         4        10    0        0
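
As an alternative, the same rule can be sketched without groupby.apply. This is not part of the original answer, only one possible vectorized variant under stated assumptions: the sample frame is rebuilt here with assumed dtypes, RefQty is a helper name made up for this example, only rows with Key == 1 are flagged, and a group with no Key == 0 row gets 0.

```python
import pandas as pd

# Sample data from the question; dtypes are assumed.
df = pd.DataFrame({
    "Document": [123, 123, 123, 456],
    "Date": pd.to_datetime(["2020-12-02"] * 4),
    "Schedule": [1, 2, 3, 4],
    "Quantity": [20, 10, 5, 10],
    "Key": [1, 0, 0, 0],
})

keys = ["Document", "Date"]

# Per group: Quantity of the Key == 0 row with the lowest Schedule.
# "RefQty" is a hypothetical helper name used only in this sketch.
ref = (
    df[df["Key"] == 0]
    .sort_values("Schedule")
    .drop_duplicates(subset=keys, keep="first")
    .set_index(keys)["Quantity"]
    .rename("RefQty")
)

# Align the reference quantity with every row of its group
# (NaN where a group has no Key == 0 row).
ref_qty = df.join(ref, on=keys)["RefQty"]

# Flag Key == 1 rows whose Quantity differs from the reference.
df["New_Col"] = (
    (df["Key"] == 1) & ref_qty.notna() & (df["Quantity"] != ref_qty)
).astype(int)

print(df)
```

On the sample data this should give the same New_Col values as the apply-based version. Sorting once and joining back avoids calling a Python function per group, which may matter on frames with many groups.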
