Applying a function to a DataFrame with apply() based on a condition

Question

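In the function below, myfun first checks whether a particular condition is met and only then does its work.

This check happens inside the function.

Can the check be made before the function is applied?

For example: if [column] == xyx, .apply(myfun)

Some code below: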

import pandas as pd

x = pd.DataFrame({'col1':['hi','hello','hi','hello'],
                 'col2':['random', 'words', 'in', 'here']})
print(x)

    col1    col2
0     hi  random
1  hello   words
2     hi      in
3  hello    here

My function checks whether row['col1'] == 'hi' and returns the string 'success', otherwise np.nan.

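import numpy as np

def myfun(row):
    # return 'success' if this row's col1 is the string 'hi'
    if row['col1'] == 'hi':
        return 'success'
    # otherwise return NaN (np.nan from numpy; the pd.np alias is gone in recent pandas)
    else:
        return np.nan

# apply the function to every row
x['result'] = x.apply(myfun, axis=1)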


# result

    col1    col2   result
0     hi  random  success
1  hello   words      NaN
2     hi      in  success
3  hello    here      NaN

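Is it possible to apply the function only to the rows where col1 == 'hi', instead of doing the check inside the function passed to apply()?

Note: I would prefer a solution that uses apply(). I know there are other options, such as np.where.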

Answer 1

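Yes, you can, and there is a better way than apply: use loc, like this.

apply loops over every row, whereas loc is a vectorized method. apply is really powerful, but I try to avoid it whenever I can.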

x.loc[x['col1']=='hi', 'result'] = 'success'
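As a minimal sketch of what that one-liner does on the question's sample data (assuming the 'result' column does not exist yet): assigning through loc creates the column, fills the rows selected by the mask, and leaves the rest as NaN. The name mask is introduced here only for readability.

```python
import pandas as pd

x = pd.DataFrame({'col1': ['hi', 'hello', 'hi', 'hello'],
                  'col2': ['random', 'words', 'in', 'here']})

# Boolean mask: True where col1 equals 'hi'
mask = x['col1'] == 'hi'

# Assigning through .loc creates 'result' if it is missing;
# rows where the mask is False are left as NaN
x.loc[mask, 'result'] = 'success'

print(x)
```

No Python-level loop over rows is involved, which is why this tends to scale better than apply with axis=1.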

Answer 2

Here is how to use apply() based on a condition. I can now remove the condition check from the function:

def myfun(row):

    return 'success'

# applying the function based on condition
x['result'] = x[x['col1']=='hi'].apply(myfun,axis=1)

I can also create a mask first.

mask = (x['col1']=='hi')

# applying the function based on condition
x['result'] = x[mask].apply(myfun,axis=1)
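The reason this fills the non-matching rows with NaN is index alignment. A small sketch on the same sample data, where partial is just an illustrative name for the intermediate result:

```python
import pandas as pd

x = pd.DataFrame({'col1': ['hi', 'hello', 'hi', 'hello'],
                  'col2': ['random', 'words', 'in', 'here']})

def myfun(row):
    return 'success'

mask = x['col1'] == 'hi'

# apply() runs only on the filtered rows, so the returned Series
# keeps just the index labels of those rows (0 and 2)
partial = x[mask].apply(myfun, axis=1)
print(partial)

# Column assignment aligns on the index; labels 1 and 3 are
# missing from partial, so they become NaN in 'result'
x['result'] = partial
print(x)
```

Because the assignment aligns on index labels rather than position, this relies on the DataFrame having a unique index.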

2 solutions

Solution 1
1 2019-08-29 16:23:29

Solution 2
0 Accepted 2019-08-29 17:15:33
