How do I apply a function to a DataFrame column using multiple rows and multiple columns as input?

Question

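I have a series of events and, based on a few variables (the previous command, the previous/current code, and the previous/current status), I need to decide which command each event corresponds to.

I actually have code that works as expected, but it is a bit slow. So I tried to use df.apply, but I don't think it can take more input than the current element. (The loop starts at 1 because the first row is always a "begin" command.)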

def mark_commands(df):
    for i in range(1, len(df)):
        prev_command = df.loc[i-1, 'Command']
        prev_code, cur_code = df.loc[i-1, 'Code'], df.loc[i, 'Code']
        prev_status, cur_status = df.loc[i-1, 'Status'], df.loc[i, 'Status']

        if (prev_command == "end" and 
            ((cur_code == 810 and cur_status in [10, 15]) or 
            (cur_code == 830 and cur_status == 15))):

            df.loc[i, 'Command'] = "ignore"

        elif ((cur_code == 800 and cur_status in [20, 25]) or 
            (cur_code in [810, 830] and cur_status in [10, 15])):

            df.loc[i, 'Command'] = "end"

        elif ((prev_code != 800) and 
            ((cur_code == 820 and cur_status == 25) or 
            (cur_code == 820 and cur_status == 20 and 
                prev_code in [810, 820] and prev_status == 20) or 
            (cur_code == 830 and cur_status == 25 and 
                prev_code == 820 and prev_status == 20))):

            df.loc[i, 'Command'] = "continue"

        else:

            df.loc[i, 'Command'] = "begin"

    return df

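Here is a correctly labeled sample in CSV format (it can also serve as input, since the only difference is that everything in the Command column after the first "begin" is empty):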

Code,Status,Command
810,20,begin
810,10,end
810,25,begin
810,15,end
810,15,ignore
810,20,begin
810,10,end
810,25,begin
810,15,end
810,15,ignore
810,20,begin
800,20,end
810,10,ignore
810,25,begin
820,25,continue
820,25,continue
820,25,continue
820,25,continue
800,25,end
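
To reproduce the input described above, the sample can be loaded from a string and every Command after the first row blanked out. This is only a sketch: the names `SAMPLE_CSV`, `expected`, and `df` are illustrative and not part of the original post, and it assumes pandas is installed.

```python
import io

import pandas as pd

# The labeled sample from the question, kept as a string (illustrative name).
SAMPLE_CSV = """Code,Status,Command
810,20,begin
810,10,end
810,25,begin
810,15,end
810,15,ignore
810,20,begin
810,10,end
810,25,begin
810,15,end
810,15,ignore
810,20,begin
800,20,end
810,10,ignore
810,25,begin
820,25,continue
820,25,continue
820,25,continue
820,25,continue
800,25,end
"""

# Correctly labeled frame, useful for checking a result later.
expected = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Input frame: same Code/Status values, but only the first Command is known.
df = expected.copy()
df.loc[1:, "Command"] = ""
```

Keeping `expected` around makes it easy to compare the Command column produced by any implementation against the known labels.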

Answer 1

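Your code is mostly fine (you could use df.iterrows(), which would make the for loop more robust if your index is not sequential, but it won't change the speed).

After trying df.apply extensively, I realized there is a fatal flaw: your "Command" column keeps being updated from one row to the next. The following won't work, because df is in a sense "static" while apply runs: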
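
The index concern can be illustrated with a small, made-up frame: `df.loc[i-1, ...]` looks rows up by label, so it only works when the index is 0, 1, 2, and so on. The sketch below is an assumption about how one might guard against that, using `reset_index(drop=True)` rather than the `iterrows()` approach mentioned above; the three rows are hypothetical.

```python
import pandas as pd

# Hypothetical frame whose index is not 0..n-1 (for example, after filtering).
df = pd.DataFrame(
    {"Code": [810, 810, 800], "Status": [20, 10, 20]},
    index=[10, 20, 30],
)

# Label-based lookup fails here: there is no row labeled 0.
try:
    df.loc[0, "Code"]
    label_lookup_ok = True
except KeyError:
    label_lookup_ok = False

# reset_index(drop=True) restores a 0..n-1 index, so df.loc[i-1, ...] is valid again.
df = df.reset_index(drop=True)
prev_code = df.loc[0, "Code"]
```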

df['Command'] = df.apply(lambda row: mark_commands(row), axis=1)
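
One way around this, which is not part of the original answer, is to keep the loop but stop touching the DataFrame inside it: pull the columns out as plain Python lists, carry the previous command along as ordinary state, and assign the whole column once at the end. Scalar `df.loc` access is usually much more expensive than list indexing, so this tends to help, though the actual gain was not measured here. The function name `mark_commands_fast` is hypothetical, and unlike the original it returns a copy instead of modifying `df` in place; it assumes at least one row and a first Command that is already filled in.

```python
import pandas as pd


def mark_commands_fast(df):
    """Apply the same rules as mark_commands, iterating over plain lists."""
    codes = df["Code"].tolist()
    statuses = df["Status"].tolist()
    commands = [df["Command"].iloc[0]]  # first row is given (always "begin")

    for i in range(1, len(codes)):
        prev_command = commands[-1]  # the value decided on the previous step
        prev_code, cur_code = codes[i - 1], codes[i]
        prev_status, cur_status = statuses[i - 1], statuses[i]

        if prev_command == "end" and (
            (cur_code == 810 and cur_status in (10, 15))
            or (cur_code == 830 and cur_status == 15)
        ):
            commands.append("ignore")
        elif (cur_code == 800 and cur_status in (20, 25)) or (
            cur_code in (810, 830) and cur_status in (10, 15)
        ):
            commands.append("end")
        elif prev_code != 800 and (
            (cur_code == 820 and cur_status == 25)
            or (cur_code == 820 and cur_status == 20
                and prev_code in (810, 820) and prev_status == 20)
            or (cur_code == 830 and cur_status == 25
                and prev_code == 820 and prev_status == 20)
        ):
            commands.append("continue")
        else:
            commands.append("begin")

    out = df.copy()
    out["Command"] = commands  # single column assignment instead of per-row writes
    return out


# Tiny usage example with made-up rows.
demo = pd.DataFrame(
    {"Code": [810, 810, 810], "Status": [20, 10, 15], "Command": ["begin", "", ""]}
)
result = mark_commands_fast(demo)
```

The sequential dependency on the previous command is still handled by a loop; only the per-element DataFrame access is removed.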

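Finally, to save some computation, you could insert a continue statement after each branch so the loop moves straight on to the next iteration once a condition is met. (In an if/elif chain the remaining branches are already skipped after the first match, so any gain is likely negligible.)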

if (prev_command == "end" and ....) :
    df.loc[i, 'Command'] = "ignore"
    continue

That said, your code is good.

Solution 1
0 2019-06-18 23:52:40
