[英]Pandas DataFrame check colums value in other column
I have test_df with columns 'MonthAbbr' and 'PromoInterval' 我有test_df列'MonthAbbr'和'PromoInterval'
Example output 输出示例
1017174 Jun Mar,Jun,Sept,Dec
1017175 Mar Mar,Jun,Sept,Dec
1017176 Feb Mar,Jun,Sept,Dec
1017177 Feb Feb,May,Aug,Nov
1017178 Jan Feb,May,Aug,Nov
1017179 Jan Mar,Jun,Sept,Dec
1017180 Jan Mar,Jun,Sept,Dec
I want add column-indicator is month in promo interval, which will =1 if MonthAbbr in PromoInterval for current row, =0 otherwise 我想在促销间隔中添加column-indicator是月份,如果当前行在PromoInterval中为MonthAbbr,则它将为= 1,否则为= 0
Is there more efficient way? 有没有更有效的方法?
for ind in test_df.index:
test_df.set_value(ind ,'IsPromoInThisMonth',
test_df.MonthAbbr.astype(str)[ind] in (test_df.PromoInterval.astype(str)[ind])
This is a bit faster: 这有点快:
%%timeit
test_df['IsPromoInThisMonth'] = [x in y for x, y in zip(test_df['MonthAbbr'],
test_df['PromoInterval'])]
1000 loops, best of 3: 317 µs per loop
Than your approach: 比您的方法:
%%timeit
for ind in test_df.index:
test_df.set_value(ind ,'IsPromoInThisMonth',
test_df.MonthAbbr.astype(str)[ind] in (test_df.PromoInterval.astype(str)[ind]))
1000 loops, best of 3: 1.44 ms per loop
UPDATE 更新
Using a function with apply
is slower than the list comprehension: 将函数与apply
一起apply
比列表理解要慢:
%%timeit
test_df['IsPromoInThisMonth'] = test_df.apply(lambda x: x[0] in x[1], axis=1)
1000 loops, best of 3: 804 µs per loop
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.