
Can pytorch optimize sequential operations (like a tensorflow graph or JAX's jit)?

Originally, tensorflow and pytorch had a fundamental difference:

  • tensorflow is based on a computational graph. Building the graph and evaluating it in a session are two separate steps. The graph doesn't change while it is being evaluated, which allows for optimizations.
  • torch eagerly evaluates operations on a tensor (see the sketch below). This makes the API more convenient (no sessions) but also loses the potential to recognize and optimize operations that always occur in sequence.
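
To illustrate the contrast, here is a minimal sketch of the eager style (a hypothetical toy snippet, not from the original question):

    import torch

    # Eager evaluation: each statement executes immediately,
    # with no separate graph-building or session step.
    x = torch.ones(3)
    y = x * 2 + 1   # computed right away
    print(y)        # tensor([3., 3., 3.])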

Now this difference is becoming less clear. Tensorflow has responded to the popularity of torch with tf eager. There is also the JAX project, which builds on the same underlying compiler framework as tensorflow (XLA). JAX has no concept of a session, but it lets you compile multiple operations together by simply calling jit.
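
For reference, compiling a sequence of operations in JAX is a one-liner; a minimal sketch (the function fn below is a made-up toy example):

    import jax
    import jax.numpy as jnp

    def fn(x):
        return jnp.sum(jnp.tanh(x) ** 2)

    fast_fn = jax.jit(fn)    # XLA compiles the whole sequence as one graph
    x = jnp.ones((1000,))
    print(fast_fn(x))        # first call traces and compiles; later calls reuse it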

Since Tensorflow has moved to cover PyTorch functionality, is PyTorch also working on integrating Tensorflow's advantages? Is there something like a session or a jit functionality in PyTorch (or on its roadmap)?

The API docs have a jit section, but as far as I can see, that is more about exporting your models.

As you mentioned, there is torch.jit, and its purpose is also to introduce optimizations in the exported graph (e.g. kernel fusion, constant folding, etc.). IIRC you can find some source code regarding those passes in their github repo here, though I'm not sure whether they are explicitly mentioned anywhere in the docs (or explicitly enough to be remembered).
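
As a rough sketch of how that looks in practice (the module below is a hypothetical toy, not from the docs): torch.jit.script compiles a module to TorchScript, where those optimization passes can run, and the result can be saved for export:

    import torch

    class MyModel(torch.nn.Module):
        def forward(self, x):
            return torch.relu(x) * 2.0 + 1.0   # constants here can be folded

    scripted = torch.jit.script(MyModel())  # compile to a TorchScript graph
    print(scripted.graph)                   # inspect the intermediate representation
    scripted.save("model.pt")               # export, as the docs emphasize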

Since 1.3, quantization has also been introduced (see here for an introduction). In the tutorials section, namely here, you can see explicit fusion of Conv2d , BatchNorm and ReLU in order to improve performance. Of course, there also exists more specific stuff like using int instead of float for weights (quantization), mixed-precision arithmetic (using half-precision floats whenever possible, see NVidia's Apex) and others.
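
A minimal sketch of that Conv2d + BatchNorm + ReLU fusion, along the lines of the tutorial (the toy model here is made up):

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3),
        nn.BatchNorm2d(16),
        nn.ReLU(),
    )
    model.eval()  # Conv+BN folding requires eval mode

    # Fuse the three submodules (named '0', '1', '2' inside the Sequential):
    fused = torch.quantization.fuse_modules(model, [['0', '1', '2']])
    print(fused)  # BatchNorm is folded into the conv, ReLU is attached to it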

Last but not least, I don't think that for a well-written model using vectorized operations and exported with torchscript you are going to see substantial runtime differences from generic graph optimizations alone. Still, it differs depending on whether you use a GPU, CPU or TPU, which versions they are, whether you are after inference only or training as well, etc. It's pretty hard to pinpoint how fast tensorflow is in comparison to pytorch (besides some well-known issues in both frameworks). All in all, it depends, and measurements vary a lot AFAIK.

BTW. When it comes to the advantages of each framework, their cores indeed start to cover similar things (PyTorch got mobile support lately, see here). The real difference is still the underlying approach and what each framework has to do to circumvent the limitations of that approach.
