
Understanding numpy's vectorization of loops

I want to verify that I've understood the concept of vectorized code that is mentioned in many Machine Learning lectures/notes/videos.

I did some reading on this and found that CPUs and GPUs have an instruction set category called SIMD: Single Instruction, Multiple Data.

This works, for example, by loading two operands into special 64/128-bit registers and then operating on all of the packed lanes at once.

I've also read that most modern compilers, GCC for example, can do this automatically if you turn on optimization with the -Ofast flag, which is documented as:

-Ofast - Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs. It turns on -ffast-math and the Fortran-specific -fno-protect-parens and -fstack-arrays.

With -Ofast, the compiler should then auto-vectorize loops written in C/C++ into SIMD instructions when possible.

I tested this out on my own code and got a significant speedup on the MNIST dataset: from 45 minutes down to 5 minutes.

I am also aware that numpy is written in C and wrapped in Python objects (PyObject). I read through a lot of its code, but it is difficult to follow.

My question is then: is my understanding above correct, and does numpy do the same thing, or does it use explicit pragmas or other special instructions/register names for its vectorization?

numpy doesn't do anything like that.

The term "vectorization" in the numpy context means that you make numpy work on your array directly rather than writing a loop yourself. The work is usually dispatched to what are called "universal functions", or "ufuncs" for short. These are C functions that apply the intended operation to the array elements in a plain C for loop.
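As a minimal sketch (array sizes chosen arbitrarily for illustration), the two styles look like this — the first loop iterates in the Python interpreter, while the second runs entirely inside the `np.add` ufunc's C loop:

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
b = np.arange(1_000_000, dtype=np.float64)

# Explicit Python loop: one interpreted iteration per element
out_loop = np.empty_like(a)
for i in range(len(a)):
    out_loop[i] = a[i] + b[i]

# "Vectorized" call: the whole loop runs in C inside the np.add ufunc
out_ufunc = a + b  # equivalent to np.add(a, b)

assert np.array_equal(out_loop, out_ufunc)
```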

But numpy usually cannot rely on ISA-level vectorization. The reason is that these functions are universal: they must work for all kinds of arrays, dense arrays as well as views onto dense arrays. Because of the strided access pattern this requires, you cannot expect the compiler to vectorize the loop.
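The point about views can be seen directly: the very same ufunc has to handle both a contiguous array and a strided view of it, so its inner loop is written over arbitrary byte strides. A small sketch:

```python
import numpy as np

a = np.arange(20, dtype=np.float64)
view = a[::3]          # non-contiguous view: every 3rd element
dense = view.copy()    # contiguous copy of the same values

# The same np.add ufunc handles both memory layouts
assert np.array_equal(np.add(view, 1.0), np.add(dense, 1.0))

# The layouts differ: the view steps 24 bytes per element, the copy 8
print(view.strides, dense.strides)
```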

If you want ISA-vectorized numpy-style code, you can use numba, which can JIT-compile your functions (and thus really ISA-vectorize them). There is another project that uses one of Intel's libraries, but I can't find it anymore.
