Why doesn't `tf.matmul` work with a transposed tensor?

Passing transpose_b=True works, but passing tf.transpose(inp) does not.

This screenshot was made in Colab with tensorflow-gpu==2.0.0-rc1:

[Screenshot]

transpose_b=True in tf.linalg.matmul transposes only the last two axes of the second tensor, while tf.transpose, without further arguments, reverses the order of all dimensions. The equivalent would be:

inp_t = tf.transpose(inp, (0, 2, 1))
tf.matmul(inp, inp_t)
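
As a quick check of that equivalence, here is a minimal sketch. It assumes TensorFlow 2.x with eager execution and an input of shape (2, 4, 1), built the same way as in the worked example further down; the names via_flag, via_perm and same are made up for illustration.

```python
import tensorflow as tf

# Assumed input: a batch of two 4x1 column vectors.
inp = tf.reshape(tf.linspace(-1.0, 1.0, 8), (2, 4, 1))

# Let matmul transpose the last two axes of the second argument ...
via_flag = tf.matmul(inp, inp, transpose_b=True)

# ... or swap those two axes by hand with an explicit permutation.
via_perm = tf.matmul(inp, tf.transpose(inp, perm=(0, 2, 1)))

# Both routes should give a batch of two 4x4 outer products.
same = bool(tf.reduce_all(tf.abs(via_flag - via_perm) < 1e-6))
print(via_flag.shape, via_perm.shape, same)
```

Using the flag avoids materialising the transposed tensor yourself, but the explicit perm makes it obvious which axes are being swapped.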

If you don't explicitly specify the perm (permutation) parameter, tf.transpose() reverses the order of all axes (perm defaults to (n-1, ..., 0), where n is the rank of the input tensor). That is a regular matrix transpose only for 2-D input. So set the perm parameter appropriately:

inp_t = tf.transpose(inp, perm=[0, 2, 1])  # swap only the last two axes
y = tf.matmul(inp, inp_t)
print(y)

What TensorFlow is telling you is that the dimensions do not match up when multiplying the two tensors. Think of it in basic linear algebra terms. You can only multiply two matrices when the last dimension of the first matrix is the same as the first dimension of the second. E.g. you can multiply a 2x4 matrix by a 4x2 matrix (which is what transpose gives you). From the docs:

If perm is not given, it is set to (n-1...0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors.

So if you omit perm in higher dimensions, tf.transpose() reverses the order of the dimensions, just as it does for 2-D tensors (matrices):

inp_t_without_perm = tf.transpose(inp)
inp_t_without_perm
# Output: <tf.Tensor 'transpose_8:0' shape=(1, 4, 2) dtype=float32>

So it just swaps the first dimension with the last and leaves the second one unaltered. This is equivalent to:

inp_t_with_wrong_perm = tf.transpose(inp, perm=[2,1,0])
inp_t_with_wrong_perm
# Output: <tf.Tensor 'transpose_8:0' shape=(1, 4, 2) dtype=float32>

If you then do:

mul = tf.matmul(inp, inp_t_without_perm) # or with inp_t_with_wrong_perm

you get this error, because the dimensions do not match up: inp has shape (2, 4, 1) and the transposed tensor has shape (1, 4, 2), so matmul is asked to multiply 4x1 matrices by 4x2 matrices.

Now, when multiplying higher-order tensors, you have to align the dimensions that differ in the same way you would in 2-D (think of it as dividing your tensor up into matrices and vectors. In your case, you have a vector and a matrix... Sorry, I have not come up with a better metaphor yet; when I find a quiet half-hour with pen and paper I could make it more formal using Einstein notation, but this is basically how it works...).
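
To make the alignment rule a bit more concrete: for two rank-3 tensors, a batched matrix product computes C[b, i, j] as the sum over k of A[b, i, k] * B[b, k, j], so the last axis of A has to match the second-to-last axis of B. The following is a rough pure-Python sketch of just the shape bookkeeping. The helpers transposed_shape and batched_matmul_shape are hypothetical names, not TensorFlow functions, and the sketch deliberately ignores batch broadcasting.

```python
def transposed_shape(shape, perm=None):
    """Shape after a transpose; perm=None reverses all axes (the documented default)."""
    if perm is None:
        perm = tuple(reversed(range(len(shape))))
    return tuple(shape[axis] for axis in perm)


def batched_matmul_shape(a_shape, b_shape):
    """Result shape of a batched matrix product, or ValueError on a mismatch.

    Simplifying assumption: batch axes must be identical (no broadcasting).
    """
    *a_batch, rows, inner_a = a_shape
    *b_batch, inner_b, cols = b_shape
    if inner_a != inner_b:
        raise ValueError(f"inner dimensions differ: {inner_a} vs {inner_b}")
    if a_batch != b_batch:
        raise ValueError(f"batch dimensions differ: {a_batch} vs {b_batch}")
    return (*a_batch, rows, cols)


inp_shape = (2, 4, 1)

reversed_shape = transposed_shape(inp_shape)            # (1, 4, 2)
swapped_shape = transposed_shape(inp_shape, (0, 2, 1))  # (2, 1, 4)

# Swapping only the last two axes lines up 4x1 with 1x4.
print(batched_matmul_shape(inp_shape, swapped_shape))   # (2, 4, 4)

# Reversing all axes leaves 4x1 facing 4x2, which cannot be multiplied.
try:
    batched_matmul_shape(inp_shape, reversed_shape)
except ValueError as err:
    print("mismatch:", err)  # mismatch: inner dimensions differ: 1 vs 4
```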

For your case, what works is:

inp = tf.reshape(tf.linspace(-1.0, 1.0, 8), (2,4,1))
# switch the last two dimensions so you can multiply 4x1 by 1x4
# and leave first dimension as it is.
inp_t = tf.transpose(inp, perm=[0,2,1])
mul = tf.matmul(inp, inp_t)
mul
# Output: <tf.Tensor 'MatMul_8:0' shape=(2, 4, 4) dtype=float32>

Note that in your case this is the only permutation that works, since this kind of multiplication is non-commutative. So you will have to match up dimensions from left to right (again, sorry for the hand-waving; a formal mathematical proof would require some higher-order tensor algebra, but I think this is precisely what you want to achieve...). I did not go too deep into the documentation, but I think the transpose_b parameter does precisely this permutation for you. Hope that helps. Please comment if you have further questions.
