
Pandas DataFrame: check whether a column's value appears in another column

I have test_df with columns 'MonthAbbr' and 'PromoInterval'.

Example output:

1017174           Jun  Mar,Jun,Sept,Dec
1017175           Mar  Mar,Jun,Sept,Dec
1017176           Feb  Mar,Jun,Sept,Dec
1017177           Feb  Feb,May,Aug,Nov
1017178           Jan  Feb,May,Aug,Nov
1017179           Jan  Mar,Jun,Sept,Dec
1017180           Jan  Mar,Jun,Sept,Dec

I want to add an indicator column showing whether the month falls within the promo interval: it should be 1 if MonthAbbr is in PromoInterval for the current row, and 0 otherwise.

Is there a more efficient way than this?

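# Note: DataFrame.set_value was removed in pandas 1.0;
# on current versions use test_df.at[ind, 'IsPromoInThisMonth'] = ... instead.
for ind in test_df.index:
    test_df.set_value(ind, 'IsPromoInThisMonth',
                      test_df.MonthAbbr.astype(str)[ind] in test_df.PromoInterval.astype(str)[ind])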

This is a bit faster:

%%timeit
test_df['IsPromoInThisMonth'] = [x in y for x, y in zip(test_df['MonthAbbr'], 
                                                        test_df['PromoInterval'])]

1000 loops, best of 3: 317 µs per loop

than your approach:

%%timeit
for ind in test_df.index:
    test_df.set_value(ind ,'IsPromoInThisMonth',
    test_df.MonthAbbr.astype(str)[ind] in (test_df.PromoInterval.astype(str)[ind]))

1000 loops, best of 3: 1.44 ms per loop
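
For anyone who wants to reproduce this outside a notebook, here is a minimal self-contained sketch of the list-comprehension approach. The rows are copied from the question's example output. The `int(...)` conversion is my addition so that the column holds 1/0 as the question asks, rather than True/False.

```python
import pandas as pd

# Sample rows copied from the question's example output.
test_df = pd.DataFrame(
    {
        "MonthAbbr": ["Jun", "Mar", "Feb", "Feb", "Jan", "Jan", "Jan"],
        "PromoInterval": [
            "Mar,Jun,Sept,Dec",
            "Mar,Jun,Sept,Dec",
            "Mar,Jun,Sept,Dec",
            "Feb,May,Aug,Nov",
            "Feb,May,Aug,Nov",
            "Mar,Jun,Sept,Dec",
            "Mar,Jun,Sept,Dec",
        ],
    },
    index=range(1017174, 1017181),
)

# Row-wise substring test; int() turns True/False into the 1/0 indicator.
test_df["IsPromoInThisMonth"] = [
    int(month in interval)
    for month, interval in zip(test_df["MonthAbbr"], test_df["PromoInterval"])
]

print(test_df)
```

If the real columns contain missing values, the `in` test would fail on a non-string, which is presumably why the question casts with `astype(str)` first.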

UPDATE

Using a function with apply is slower than the list comprehension:

%%timeit
test_df['IsPromoInThisMonth'] = test_df.apply(lambda x: x['MonthAbbr'] in x['PromoInterval'], axis=1)

1000 loops, best of 3: 804 µs per loop
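
One caveat that applies to every variant above: `x in y` on two strings is a substring test, not a test against the comma-separated entries. The sketch below uses hypothetical rows (a 'Sep' abbreviation does not appear in the question's sample) to show how the two can differ. Splitting on commas is one possible way to compare whole entries, and it assumes the interval has no spaces after the commas.

```python
import pandas as pd

# Hypothetical rows chosen to show the difference between the two tests.
df = pd.DataFrame(
    {
        "MonthAbbr": ["Sep", "Jun", "Jan"],
        "PromoInterval": ["Mar,Jun,Sept,Dec"] * 3,
    }
)

# Substring test: 'Sep' matches because it occurs inside 'Sept'.
df["substring_match"] = [
    int(m in p) for m, p in zip(df["MonthAbbr"], df["PromoInterval"])
]

# Token test: split on commas and compare against whole entries only.
df["token_match"] = [
    int(m in p.split(",")) for m, p in zip(df["MonthAbbr"], df["PromoInterval"])
]

print(df)
```

Whether 'Sep' should count as a match for 'Sept' depends on the data, so pick the test that matches what you intend.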
