Python Pandas: add "1" in a new column if an ID exists in another DataFrame

Question

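I have two DataFrames, each with a customer ID column (labelled "C_ID") and the number of visits in a given year.

I want to add a column to my 2010 DataFrame indicating whether the customer also shopped in 2009. So I need to create a loop that checks whether each C_ID from 2010 exists in 2009, and adds 1 if it does and 0 otherwise.

I used the code below and it did not work (no error message, nothing happens):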

for row in df_2010.iterrows():
    #check if C_ID exists in the other dataframe
    check = df_2009[(df_2009['C_ID'] == row['C_ID'])]

    if check.empty:
        #ID not exist in 2009 file, add 0 in new column
        row['shopped2009'] = 0

    else:
        #ID exists in 2009 file, add 1 into same column
        row['shopped2009'] = 1

Answer 1

You can use isin() (here Series.isin, called on the C_ID column), with numpy imported as np:

%timeit df_2010['new'] = np.where(df_2010['C_ID'].isin(df_2009['C_ID']), 1, 0)

Best of 3: 384 μs per loop
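As a minimal, self-contained sketch of the np.where approach above: the sample values and the "visits" column are made up for illustration, and only the C_ID column name and the shopped2009 column name come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the C_ID column name comes from the question.
df_2009 = pd.DataFrame({"C_ID": [101, 102, 103], "visits": [4, 1, 7]})
df_2010 = pd.DataFrame({"C_ID": [102, 104, 103, 105], "visits": [2, 5, 3, 1]})

# isin() returns a boolean Series: True where the 2010 ID also appears in 2009.
in_2009 = df_2010["C_ID"].isin(df_2009["C_ID"])

# np.where maps True -> 1 and False -> 0.
df_2010["shopped2009"] = np.where(in_2009, 1, 0)

print(df_2010)
```

The whole column is computed in one vectorized step, so no explicit loop over rows is needed.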

As @Kris suggested:

%timeit df_2010['new'] = (df_2010['C_ID'].isin(df_2009['C_ID'])).astype(int)

Best of 3: 584 μs per loop

Or, if a boolean (True/False) column is enough:

df_2010['new'] = df_2010['C_ID'].isin(df_2009['C_ID'])
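For context on why the loop in the question leaves the DataFrame unchanged: iterrows() yields (index, row) pairs, and the row it hands back is a separate Series, so assigning to it does not write into df_2010. The sketch below is one possible way to repair the loop, assuming made-up sample data. It is shown only for comparison; the isin() versions above are the simpler choice.

```python
import pandas as pd

# Hypothetical sample data; only the C_ID column name comes from the question.
df_2009 = pd.DataFrame({"C_ID": [101, 102, 103]})
df_2010 = pd.DataFrame({"C_ID": [102, 104, 103, 105]})

# Create the column up front so every row has a default of 0.
df_2010["shopped2009"] = 0

# iterrows() yields (index, row) pairs, so the pair has to be unpacked.
for idx, row in df_2010.iterrows():
    # Check whether this 2010 ID appears anywhere in the 2009 IDs.
    found = (df_2009["C_ID"] == row["C_ID"]).any()
    # Write through the DataFrame itself (.at), not through `row`,
    # because `row` is only a temporary Series.
    df_2010.at[idx, "shopped2009"] = int(found)

print(df_2010)
```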


Solution 1
6 Accepted 2017-02-06 20:46:52
