Swifter: What is the difference between vectorized and non-vectorized functions?

Question

I need to learn about pandas speed optimization. A library that looks very effective for my problem is swifter, but I don't understand the documentation, especially the part about vectorized functions.

My assumption is that swifter only accepts a vector as input, not a DataFrame. Is that wrong?

In the documentation, this is a vectorized function:

def bikes_proportion(x, max_x):
    return x * 1.0 / max_x

and this is a non-vectorized function:

def convert_to_human(datetime):
    return datetime.weekday_name + ', the ' + str(datetime.day) + 'th day of ' + datetime.strftime("%B") + ', ' + str(datetime.year)

What is the difference?

Can you tell me what the difference is between a vectorized and a non-vectorized function? And, if you have used swifter before, can it work with a DataFrame or does it only work with a vector?

Answer 1

I will try my best to explain with a simple use case.

Vectorized code refers to operations that are performed on multiple components of a vector at the same time (in one statement):

import numpy as np

a = np.array([1,2,3,4,5])
b = np.array([1,1,1,1,1])
c = a+b

In the code below, the operands are scalars rather than vectors: each addition is performed on one component of a and one component of b at a time.

a = [1,2,3,4,5]
b = [1,1,1,1,1]
c = []
for a_, b_ in zip(a, b):
    c.append(a_ + b_)
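The same distinction can be shown with pandas, using the question's bikes_proportion function. This is only a minimal sketch: the sample data and the describe_date helper are made up for illustration and do not come from the swifter documentation.

```python
import pandas as pd

def bikes_proportion(x, max_x):
    # Works on a whole Series at once: the arithmetic is applied to every element.
    return x * 1.0 / max_x

def describe_date(ts):
    # Hypothetical scalar-only helper: expects a single Timestamp, not a Series.
    return ts.strftime("%B") + " " + str(ts.year)

# Made-up sample data for illustration.
df = pd.DataFrame({
    "bikes": [10, 20, 40],
    "when": pd.to_datetime(["2019-01-15", "2019-02-20", "2019-03-25"]),
})

# Vectorized: pass the entire column in one call.
proportion = bikes_proportion(df["bikes"], df["bikes"].max())

# Non-vectorized: the function has to be called once per element.
labels = df["when"].apply(describe_date)
```

Calling describe_date(df["when"]) directly would fail, because a Series has no strftime method without the .dt accessor. That is what makes the function non-vectorized: it only understands one value at a time.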

You can also apply swifter to a DataFrame (ref: https://github.com/jmcarpenter2/swifter):

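import pandas as pd
import swifter  # importing swifter registers the .swifter accessor on pandas objects
df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [5, 6, 7, 8]})
df['agg'] = df.swifter.apply(lambda x: x.sum() - x.min(), axis=1)  # row-wise, so the result aligns with df's index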
