Compare Performance of df.apply and Column Operations in python pandas
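I would like to know whether basic arithmetic on the columns of a DataFrame runs faster when done column-wise or via apply. Off the top of my head, I would assume the column-wise approach is faster. But both approaches are said to be "vectorized" operations. So is df.apply comparable in speed?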
We can simply try it. The example below shows that the column-wise operation is much faster:
import numpy as np
import pandas as pd
from datetime import datetime


def applywise_duration(df):
    # Row-wise addition: the lambda is called once per row.
    start_time = datetime.now()
    df['C'] = df.apply(lambda row: row['A'] + row['B'], axis=1)
    end_time = datetime.now()
    return end_time - start_time


def columnwise_duration(df):
    # Column-wise addition: one operation on the whole columns.
    start_time = datetime.now()
    df['C'] = df['A'] + df['B']
    end_time = datetime.now()
    return end_time - start_time


df_apply = pd.DataFrame(
    np.random.randint(0, 10000, size=(1000000, 2)),
    columns=list('AB')
)
df_vector = df_apply.copy()

# Store the results under new names so the timing functions are not overwritten.
apply_time = applywise_duration(df_apply)
column_time = columnwise_duration(df_vector)

print('Duration of apply: ', apply_time)
print('Duration of columnwise addition: ', column_time)
print('Ratio: ', column_time / apply_time)
print('That means, in this case, columnwise addition is %s times faster '
      'than addition via apply!'
      % str(apply_time / column_time)
)
This gave the following on my machine:
Duration of apply: 0:00:23.631236
Duration of columnwise addition: 0:00:00.004234
Ratio: 0.00017916963801639492
That means, columnwise addition is 5581.302786962683 times faster than addition via apply!
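A likely explanation for the gap is that df.apply with axis=1 calls the Python lambda once per row, while df['A'] + df['B'] hands the whole columns to NumPy in a single call. The sketch below is a smaller, repeatable variant of the same comparison: it checks that both approaches produce identical values and takes the best of several runs. The helper name time_call, the repeat count, and the 20,000-row size are assumptions chosen for illustration, not part of the original benchmark.

```python
import time

import numpy as np
import pandas as pd


def time_call(func, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


# Smaller frame than in the benchmark above so the apply version finishes quickly.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10000, size=(20000, 2)), columns=list("AB"))

# Compute the sum both ways and confirm the values agree.
via_apply = df.apply(lambda row: row["A"] + row["B"], axis=1)
via_columns = df["A"] + df["B"]
same_result = bool(np.array_equal(via_apply.to_numpy(), via_columns.to_numpy()))

# Time each approach; the exact numbers depend on the machine and pandas version.
apply_seconds = time_call(lambda: df.apply(lambda row: row["A"] + row["B"], axis=1))
column_seconds = time_call(lambda: df["A"] + df["B"])

print("Same result:", same_result)
print("apply: %.4f s, column-wise: %.6f s" % (apply_seconds, column_seconds))
```

Taking the minimum over a few runs reduces the influence of one-off delays, and time.perf_counter is better suited to measuring short intervals than datetime.now.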