PyTorch equivalent, in terms of behavior, of TensorFlow's tf.keras.dot()?

Question

In TensorFlow, if you have two tensors of shape NxTxD and NxDxT (N = batch size, T = sequence length, D = number of features), you can dot them and get an output of shape NxTxT, as demonstrated below:

import tensorflow as tf
import numpy as np

x1 = np.arange(2 * 4 * 3).reshape(2, 4, 3)
x2 = np.flip(np.arange(2 * 4 * 3).reshape(2, 3, 4), 1).copy()
print(x1.shape, x2.shape)
dotted = tf.keras.layers.Dot(axes=(2, 1))([x1, x2])
print(dotted.shape)
dotted

(2, 4, 3) (2, 3, 4)
(2, 4, 4)
<tf.Tensor: shape=(2, 4, 4), dtype=int32, numpy=
array([[[   4,    7,   10,   13],
        [  40,   52,   64,   76],
        [  76,   97,  118,  139],
        [ 112,  142,  172,  202]],

       [[ 616,  655,  694,  733],
        [ 760,  808,  856,  904],
        [ 904,  961, 1018, 1075],
        [1048, 1114, 1180, 1246]]])>

If you try to do the same in PyTorch, the result is different:

import torch
import numpy as np

x1 = torch.from_numpy(np.arange(2 * 4 * 3).reshape(2, 4, 3))
x2 = torch.from_numpy(np.flip(np.arange(2 * 4 * 3).reshape(2, 3, 4), 1).copy())
dotted = torch.tensordot(x1, x2, dims=([2], [1]))
print(x1.shape, x2.shape)
print(dotted.shape)
dotted

torch.Size([2, 4, 3]) torch.Size([2, 3, 4])
torch.Size([2, 4, 2, 4])
tensor([[[[   4,    7,   10,   13],
          [  40,   43,   46,   49]],

         [[  40,   52,   64,   76],
          [ 184,  196,  208,  220]],

         [[  76,   97,  118,  139],
          [ 328,  349,  370,  391]],

         [[ 112,  142,  172,  202],
          [ 472,  502,  532,  562]]],


        [[[ 148,  187,  226,  265],
          [ 616,  655,  694,  733]],

         [[ 184,  232,  280,  328],
          [ 760,  808,  856,  904]],

         [[ 220,  277,  334,  391],
          [ 904,  961, 1018, 1075]],

         [[ 256,  322,  388,  454],
          [1048, 1114, 1180, 1246]]]], dtype=torch.int32)

TensorFlow's result is contained in the output that PyTorch produces (it is a subset of it). torch.tensordot contracts only the listed axes and does not treat N as a batch dimension, so TensorFlow's result is essentially a "diagonal" over the two batch dimensions of the larger tensor. PyTorch's output is NxTxNxT, so to get exactly the same result as TensorFlow you can do:

torch.stack([dotted[i, :, i, :] for i in range(len(dotted))])

tensor([[[   4,    7,   10,   13],
         [  40,   52,   64,   76],
         [  76,   97,  118,  139],
         [ 112,  142,  172,  202]],

        [[ 616,  655,  694,  733],
         [ 760,  808,  856,  904],
         [ 904,  961, 1018, 1075],
         [1048, 1114, 1180, 1246]]], dtype=torch.int32)
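
To make the "diagonal" reading concrete, here is a minimal sketch that takes the same slice with torch.diagonal instead of a Python loop. It rebuilds the inputs with torch only (an assumption made for self-containment; the values match the NumPy version above), and it still computes the full NxTxNxT tensor first.

```python
import torch

# Same values as in the question, built without NumPy.
x1 = torch.arange(2 * 4 * 3).reshape(2, 4, 3)                        # (N, T, D)
x2 = torch.flip(torch.arange(2 * 4 * 3).reshape(2, 3, 4), dims=[1])  # (N, D, T)

# tensordot contracts D but keeps both batch axes: shape (N, T, N, T).
dotted = torch.tensordot(x1, x2, dims=([2], [1]))

# Diagonal over the two batch axes (0 and 2). torch.diagonal appends the
# diagonal as the last dimension, giving (T, T, N), so move it to the front.
diag = torch.diagonal(dotted, dim1=0, dim2=2).permute(2, 0, 1)

# Reference: the list-comprehension version from the question.
stacked = torch.stack([dotted[i, :, i, :] for i in range(len(dotted))])

print(diag.shape)
print(torch.equal(diag, stacked))
```
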

However, this approach has two drawbacks:

It allocates memory for an NxTxNxT tensor instead of an NxTxT one.
The computational cost/time increases dramatically.
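
As a rough illustration of both points, the sketch below counts output elements and multiply-adds for the two approaches. The helper name `dot_costs` and the larger sizes are hypothetical, chosen only for the example; the counts assume each output element costs D multiply-adds.

```python
def dot_costs(n, t, d):
    """Return ((elements, multiply-adds) for NxTxNxT, same for NxTxT)."""
    four_d = (n * t * n * t, n * t * n * t * d)  # tensordot over D only
    three_d = (n * t * t, n * t * t * d)         # batched product
    return four_d, three_d

# Sizes from the question: N=2, T=4, D=3.
small_4d, small_3d = dot_costs(n=2, t=4, d=3)
print(small_4d, small_3d)

# Hypothetical larger sizes: the 4D version is N times bigger in both
# memory and arithmetic.
big_4d, big_3d = dot_costs(n=32, t=128, d=64)
print(big_4d[0] // big_3d[0], big_4d[1] // big_3d[1])
```

The overhead factor is exactly N in both measures, so it grows linearly with the batch size.
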

Is there a way to get the same 3-dimensional result in PyTorch that TensorFlow gives, without computing the 4-dimensional tensor?

Answer 1

You are probably looking for batch matrix multiplication (torch.bmm), which multiplies two batches of matrices; both tensors have to be 3D. https://pytorch.org/docs/stable/generated/torch.bmm.html
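
A minimal sketch of this suggestion on the question's data, assuming CPU tensors (the inputs are rebuilt with torch only, with the same values as the NumPy version). It also shows `@` and torch.einsum as equivalent spellings of the same batched product; none of them materializes the NxTxNxT tensor.

```python
import torch

x1 = torch.arange(2 * 4 * 3).reshape(2, 4, 3)                        # (N, T, D)
x2 = torch.flip(torch.arange(2 * 4 * 3).reshape(2, 3, 4), dims=[1])  # (N, D, T)

# Batched matrix product: one (T, D) @ (D, T) product per batch entry.
out_bmm = torch.bmm(x1, x2)                        # (N, T, T)

# Equivalent spellings of the same contraction.
out_matmul = x1 @ x2
out_einsum = torch.einsum("ntd,ndk->ntk", x1, x2)

print(out_bmm.shape)
print(out_bmm)
```

The first batch entry should match the first 4x4 block of the TensorFlow output in the question, starting with the row 4, 7, 10, 13.
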

Solution 1
0 2021-11-26 20:48:55