Swifter: what is the difference between vectorized and non-vectorized functions?
I need to learn about pandas speed optimization. A library that looks very effective for my problem is swifter, but I don't understand its documentation, especially the part about vectorized functions.
My assumption is that swifter only accepts vector input, not a DataFrame. Is that wrong?
In the documentation, this is a vectorized function:
def bikes_proportion(x, max_x):
    return x * 1.0 / max_x
and this is a non-vectorized function:
def convert_to_human(datetime):
    return datetime.weekday_name + ', the ' + str(datetime.day) + 'th day of ' + datetime.strftime("%B") + ', ' + str(datetime.year)
What is the difference between a vectorized and a non-vectorized function? And if you have used swifter before: can it work with a DataFrame, or does it only work with a vector?
I will try my best to explain with a simple use case.
Vectorized code refers to operations that are performed on multiple components of a vector at the same time (in one statement):
import numpy as np
a = np.array([1,2,3,4,5])
b = np.array([1,1,1,1,1])
c = a+b
In the code below, the operands are scalars, not vectors: the addition is performed on one component of a and one component of b at a time.
a = [1,2,3,4,5]
b = [1,1,1,1,1]
c = []
for a_, b_ in zip(a, b):
    c.append(a_ + b_)
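To connect this back to the two functions in the question, here is a minimal sketch using plain pandas (no swifter). The sample data and the variable names are made up for illustration, and I use day_name() in place of weekday_name because, as far as I know, newer pandas versions no longer provide weekday_name. The first function can be handed a whole Series in one call; the second only makes sense for one Timestamp at a time, so it has to go through apply.

```python
import pandas as pd

def bikes_proportion(x, max_x):
    # Pure arithmetic: works on a scalar or on a whole Series at once.
    return x * 1.0 / max_x

def convert_to_human(ts):
    # Uses Timestamp-only methods and str(), so it needs one value at a time.
    # day_name() stands in for weekday_name (an assumption for newer pandas).
    return ts.day_name() + ', the ' + str(ts.day) + 'th day of ' + ts.strftime('%B') + ', ' + str(ts.year)

# Hypothetical sample data, not from the swifter documentation.
bikes = pd.Series([10, 20, 40])
dates = pd.Series(pd.to_datetime(['2020-01-06', '2020-02-11']))

# Vectorized: the whole Series is passed in a single call.
vectorized_result = bikes_proportion(bikes, bikes.max())

# Non-vectorized: passing the whole Series fails, because a Series has no
# day_name attribute (it lives under the .dt accessor).
try:
    convert_to_human(dates)
    accepts_series = True
except AttributeError:
    accepts_series = False

# The function has to be called once per element instead.
elementwise_result = dates.apply(convert_to_human)
```

So "vectorized" here describes the function you pass in, not a restriction on what kind of object swifter accepts.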
Swifter can also be applied to a DataFrame (ref: https://github.com/jmcarpenter2/swifter):
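import pandas as pd
import swifter  # registers the .swifter accessor on pandas objects
df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [5, 6, 7, 8]})
df['agg'] = df.swifter.apply(lambda x: x.sum() - x.min(), axis=1)  # axis=1: one value per row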
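For comparison, a row-wise computation like the lambda above can also be written in vectorized form. The sketch below uses plain pandas only, and the variable names are mine; it is meant to show the two styles side by side, not how swifter works internally.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [5, 6, 7, 8]})

# Non-vectorized style: the lambda is called once for every row.
row_by_row = df.apply(lambda row: row.sum() - row.min(), axis=1)

# Vectorized style: each operation runs over the whole DataFrame at once.
whole_frame = df.sum(axis=1) - df.min(axis=1)
```

Both give the same values; the second form avoids a Python-level function call per row, which is usually where the speed difference comes from.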