Python Pandas: add a "1" in a new column if the ID exists in another dataframe

Question

I have two dataframes with customer IDs (labeled "C_ID") and the number of visits for a year.

I want to add a column to my 2010 dataframe that indicates whether the customer also shopped in 2009. So I need a loop that checks whether each C_ID from 2010 exists in 2009 and adds a 1 if it does, otherwise a 0.

I used this code, but it didn't work (no error message, nothing happens):

for row in df_2010.iterrows():
    #check if C_ID exists in the other dataframe
    check = df_2009[(df_2009['C_ID'] == row['C_ID'])]

    if check.empty:
        #ID not exist in 2009 file, add 0 in new column
        row['shopped2009'] = 0

    else:
        #ID exists in 2009 file, add 1 into same column
        row['shopped2009'] = 1

Answer 1

You can use isin() on the C_ID column (Series.isin):

%timeit df_2010['new'] = np.where(df_2010['C_ID'].isin(df_2009['C_ID']), 1, 0)

best of 3: 384 µs per loop

As @Kris suggested:

%timeit df_2010['new'] = (df_2010['C_ID'].isin(df_2009['C_ID'])).astype(int)

best of 3: 584 µs per loop

Or, if a True/False column is enough instead of 1/0:

df_2010['new'] = df_2010['C_ID'].isin(df_2009['C_ID'])
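For anyone who wants to try this outside of IPython, here is a minimal self-contained sketch of the same isin() approach. The toy data and the "visits" column name are made up for illustration and are not from the question; the flag column is named shopped2009 to match the question's loop.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only (not the asker's data).
df_2009 = pd.DataFrame({"C_ID": [1, 2, 3], "visits": [5, 1, 2]})
df_2010 = pd.DataFrame({"C_ID": [2, 3, 4, 5], "visits": [3, 7, 1, 4]})

# Boolean mask: True where the 2010 customer ID also appears in 2009.
mask = df_2010["C_ID"].isin(df_2009["C_ID"])

# Two equivalent ways to turn the mask into a 1/0 flag.
df_2010["shopped2009"] = mask.astype(int)
flag_where = np.where(mask, 1, 0)

print(df_2010)
```

Both variants should give 1 for IDs 2 and 3 and 0 for IDs 4 and 5 here. No loop over rows is needed, since isin() checks the whole column at once.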

Answer 1: 6 · accepted · 2017-02-06 20:46:52
