Pandas apply based on conditional from another column
I'm looking to adjust the values of one column based on a condition in another column.
I'm using np.busday_count, but I don't want weekend values to behave like a Monday: Saturday to Tuesday is counted as 1 working day, and I'd like that to be 2.
dispdf = df[(df.dispatched_at.isnull()==False) & (df.sold_at.isnull()==False)]
dispdf["dispatch_working_days"] = np.busday_count(dispdf.sold_at.tolist(), dispdf.dispatched_at.tolist())
for i in range(len(dispdf)):
    if dispdf.dayofweek.iloc[i] == 5 or dispdf.dayofweek.iloc[i] == 6:
        dispdf.dispatch_working_days.iloc[i] += 1
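The behaviour described above can be reproduced with a small sketch. The dates are picked purely for illustration and are not from the original data: np.busday_count counts business days from the start date up to, but not including, the end date, so a Saturday start and a Monday start ending on the same Tuesday give the same count.

```python
import numpy as np

# Illustrative dates only: 2021-01-02 is a Saturday,
# 2021-01-04 a Monday and 2021-01-05 a Tuesday.
sat_to_tue = np.busday_count("2021-01-02", "2021-01-05")
mon_to_tue = np.busday_count("2021-01-04", "2021-01-05")

# Saturday and Sunday are skipped, so only Monday is counted in both cases.
print(sat_to_tue, mon_to_tue)
```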
Sample:
       dayofweek  dispatch_working_days
43159        1.0                      3
48144        3.0                      3
45251        6.0                      1
49193        3.0                      0
42470        3.0                      1
47874        6.0                      1
44500        3.0                      1
43031        6.0                      3
43193        0.0                      4
43591        6.0                      3
Expected results:
       dayofweek  dispatch_working_days
43159        1.0                      3
48144        3.0                      3
45251        6.0                      2
49193        3.0                      0
42470        3.0                      1
47874        6.0                      2
44500        3.0                      1
43031        6.0                      2
43193        0.0                      4
43591        6.0                      4
At the moment I'm using this for loop to add a working day to Saturday and Sunday values. It's slow!
Can I use vectorization instead to speed this up? I tried using .apply, but to no avail.
Pretty sure this works, but there are more optimized implementations:
def adjust_dispatch(df_line):
    # dayofweek: 5 = Saturday, 6 = Sunday
    if df_line['dayofweek'] >= 5:
        return df_line['dispatch_working_days'] + 1
    else:
        return df_line['dispatch_working_days']

# axis=1 passes each row to the function
df['dispatch_working_days'] = df.apply(adjust_dispatch, axis=1)
The for loop in your code could be replaced by this line:
dispdf.loc[dispdf.dayofweek >= 5, 'dispatch_working_days'] += 1
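A minimal sketch of this boolean-mask approach, run on a few rows copied from the sample in the question. It assumes Saturday and Sunday are encoded as 5 and 6, as in the question's loop, so the mask uses >= 5. The variable name weekend is only illustrative.

```python
import pandas as pd

# A few rows copied from the sample in the question.
dispdf = pd.DataFrame(
    {
        "dayofweek": [1.0, 3.0, 6.0, 3.0, 0.0],
        "dispatch_working_days": [3, 3, 1, 0, 4],
    },
    index=[43159, 48144, 45251, 49193, 43193],
)

# Boolean mask: True where dayofweek is Saturday (5) or Sunday (6).
weekend = dispdf["dayofweek"] >= 5

# Add one working day to the masked rows only, in one vectorized step.
dispdf.loc[weekend, "dispatch_working_days"] += 1

print(dispdf)
```

If dispdf was created by filtering another DataFrame, as in the question, taking an explicit .copy() of the filtered frame first avoids pandas' SettingWithCopyWarning.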
Or you could use numpy.where:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html
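A rough sketch of how numpy.where might be applied here, again on a few rows copied from the question's sample and assuming 5 and 6 stand for Saturday and Sunday. np.where picks from its second argument where the condition is True and from its third argument elsewhere.

```python
import numpy as np
import pandas as pd

# A few rows copied from the sample in the question.
dispdf = pd.DataFrame(
    {
        "dayofweek": [3.0, 6.0, 3.0, 0.0],
        "dispatch_working_days": [1, 1, 1, 4],
    },
    index=[42470, 47874, 44500, 43193],
)

# Weekend rows get one extra working day; all other rows keep their value.
dispdf["dispatch_working_days"] = np.where(
    dispdf["dayofweek"] >= 5,
    dispdf["dispatch_working_days"] + 1,
    dispdf["dispatch_working_days"],
)

print(dispdf)
```

Unlike the .loc version, this rebuilds the whole column rather than updating the selected rows in place.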