
How to matmul a 2D tensor with a 3D tensor in TensorFlow?

In NumPy you can multiply a 2D array with a 3D array, as in the example below:

>>> X = np.random.randn(3,5,4) # [3,5,4]
... W = np.random.randn(5,5) # [5,5]
... out = np.matmul(W, X) # [3,5,4]
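
A quick sanity check (a minimal sketch reusing the arrays above) that this is the same as multiplying W against each slice X[i]:

>>> out_loop = np.stack([np.matmul(W, X[i]) for i in range(3)]) # explicit per-slice product
... print(np.allclose(out, out_loop)) # True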

From my understanding, np.matmul() takes W and broadcasts it along the first dimension of X. But in TensorFlow this is not allowed:

>>> _X = tf.constant(X)
... _W = tf.constant(W)
... _out = tf.matmul(_W, _X)

ValueError: Shape must be rank 2 but is rank 3 for 'MatMul_1' (op: 'MatMul') with input shapes: [5,5], [3,5,4].

So is there an equivalent in TensorFlow for what np.matmul() does above? And what is the best practice in TensorFlow for multiplying a 2D tensor with a 3D tensor?

Try using tf.tile to match the dimensions of the matrices before multiplication. NumPy's automatic broadcasting doesn't seem to be implemented in TensorFlow; you have to do it manually.

W_T = tf.tile(tf.expand_dims(W,0),[3,1,1])

This should do the trick:

import numpy as np
import tensorflow as tf

X = np.random.randn(3,4,5) # note: shape (3,4,5), unlike the question's (3,5,4)
W = np.random.randn(5,5)

_X = tf.constant(X)
_W = tf.constant(W)
# Replicate W along a new leading batch dimension: (5,5) -> (3,5,5)
_W_t = tf.tile(tf.expand_dims(_W,0),[3,1,1])

with tf.Session() as sess:
    print(sess.run(tf.matmul(_X,_W_t))) # batched product X @ W, shape (3,4,5)
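
Note that this example, with X of shape (3,4,5), computes the batched product X @ W rather than the question's np.matmul(W, X). To reproduce the question exactly, tile W the same way and put it on the left (a minimal sketch reusing W from above):

X = np.random.randn(3,5,4)
_X = tf.constant(X)
_W_t = tf.tile(tf.expand_dims(tf.constant(W),0),[3,1,1]) # (5,5) -> (3,5,5)

with tf.Session() as sess:
    # Tiled W on the left reproduces np.matmul(W, X):
    print(np.allclose(sess.run(tf.matmul(_W_t,_X)), np.matmul(W, X))) # True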

You can use tf.tensordot instead:

tf.transpose(tf.tensordot(_W, _X, axes=[[1],[1]]),[1,0,2])
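
A full runnable version of this approach (a minimal sketch; tf.tensordot contracts W's axis 1 with X's axis 1, producing shape [5,3,4], and the tf.transpose moves the batch axis back to the front):

import numpy as np
import tensorflow as tf

X = np.random.randn(3,5,4)
W = np.random.randn(5,5)

_X = tf.constant(X)
_W = tf.constant(W)
# Contract over the shared dimension of size 5, then restore batch-first layout:
_out = tf.transpose(tf.tensordot(_W, _X, axes=[[1],[1]]), [1,0,2]) # [5,3,4] -> [3,5,4]

with tf.Session() as sess:
    print(np.allclose(sess.run(_out), np.matmul(W, X))) # True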

The following is from the TensorFlow XLA broadcasting semantics documentation:

The XLA language is as strict and explicit as possible, avoiding implicit and "magical" features. Such features may make some computations slightly easier to define, at the cost of more assumptions baked into user code that will be difficult to change in the long term.

So TensorFlow doesn't offer a built-in broadcasting feature.

However, it does offer an operation that can replicate a tensor as if it had been broadcast. This operation is called tf.tile.

The signature is as follows:

tf.tile(input, multiples, name=None)

This operation creates a new tensor by replicating input multiples times. The output tensor's i'th dimension has input.dims(i) * multiples[i] elements, and the values of input are replicated multiples[i] times along the 'i'th dimension.
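
For example, replicating a 2D matrix along a new leading dimension (a minimal sketch):

import tensorflow as tf

W = tf.constant([[1., 2.], [3., 4.]]) # shape (2,2)
W3 = tf.tile(tf.expand_dims(W, 0), [3, 1, 1]) # shape (3,2,2): three stacked copies of W

with tf.Session() as sess:
    print(sess.run(W3).shape) # (3, 2, 2)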

You can also use tf.einsum to avoid tiling the tensor:

tf.einsum("ab,ibc->iac", _W, _X)

A full example:

import numpy as np
import tensorflow as tf

# Numpy-style matrix multiplication:
X = np.random.randn(3,5,4)
W = np.random.randn(5,5)
np_WX = np.matmul(W, X)

# TensorFlow-style multiplication:
_X = tf.constant(X)
_W = tf.constant(W)
_WX = tf.einsum("ab,ibc->iac", _W, _X)

with tf.Session() as sess:
    tf_WX = sess.run(_WX)

# Check that the results are the same:
print(np.allclose(np_WX, tf_WX))

Here I'll use the Keras backend's K.dot and TensorFlow's tf.transpose. First, swap the inner dimensions of the 3D tensor:

X = tf.transpose(X, perm=[0,2,1]) # X shape=[3,4,5]; perm entries must be non-negative

Now multiply like so:

out = K.dot(X, tf.transpose(W)) # out shape=[3,4,5]; W is transposed so that swapping the axes back yields W @ X (K.dot(X, W) would yield W.T @ X)

And now swap the axes back:

out = tf.transpose(out, perm=[0,2,1]) # out shape=[3,5,4]

The solution above saves memory, at a small cost in time, because you are not tiling W.
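
Putting the pieces together and checking against np.matmul (a minimal sketch; assumes the Keras backend is available as tf.keras.backend):

import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K

X = np.random.randn(3,5,4)
W = np.random.randn(5,5)

_X = tf.transpose(tf.constant(X), perm=[0,2,1]) # [3,5,4] -> [3,4,5]
_out = K.dot(_X, tf.transpose(tf.constant(W))) # [3,4,5] x [5,5] -> [3,4,5]
_out = tf.transpose(_out, perm=[0,2,1]) # [3,4,5] -> [3,5,4]

with tf.Session() as sess:
    print(np.allclose(sess.run(_out), np.matmul(W, X))) # True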
