
PyTorch equivalent to Tensorflow's tf.keras.layers.Dot in terms of behaviour?

In TensorFlow, if you have two tensors of shape NxTxD and NxDxT respectively (N = batch size, T = sequence length, D = number of features), you can take their dot product with tf.keras.layers.Dot and get an output of shape NxTxT, as demonstrated below:

import tensorflow as tf
import numpy as np

x1 = np.arange(2 * 4 * 3).reshape(2, 4, 3)
x2 = np.flip(np.arange(2 * 4 * 3).reshape(2, 3, 4), 1).copy()
print(x1.shape, x2.shape)
dotted = tf.keras.layers.Dot(axes=(2, 1))([x1, x2])
print(dotted.shape)
dotted
(2, 4, 3) (2, 3, 4)
(2, 4, 4)
<tf.Tensor: shape=(2, 4, 4), dtype=int32, numpy=
array([[[   4,    7,   10,   13],
        [  40,   52,   64,   76],
        [  76,   97,  118,  139],
        [ 112,  142,  172,  202]],

       [[ 616,  655,  694,  733],
        [ 760,  808,  856,  904],
        [ 904,  961, 1018, 1075],
        [1048, 1114, 1180, 1246]]])>

If you try to do the same in PyTorch with torch.tensordot, the result is different:

import torch
import numpy as np

x1 = torch.from_numpy(np.arange(2 * 4 * 3).reshape(2, 4, 3))
x2 = torch.from_numpy(np.flip(np.arange(2 * 4 * 3).reshape(2, 3, 4), 1).copy())
dotted = torch.tensordot(x1, x2, dims=([2], [1]))
print(x1.shape, x2.shape)
print(dotted.shape)
dotted
torch.Size([2, 4, 3]) torch.Size([2, 3, 4])
torch.Size([2, 4, 2, 4])
tensor([[[[   4,    7,   10,   13],
          [  40,   43,   46,   49]],

         [[  40,   52,   64,   76],
          [ 184,  196,  208,  220]],

         [[  76,   97,  118,  139],
          [ 328,  349,  370,  391]],

         [[ 112,  142,  172,  202],
          [ 472,  502,  532,  562]]],


        [[[ 148,  187,  226,  265],
          [ 616,  655,  694,  733]],

         [[ 184,  232,  280,  328],
          [ 760,  808,  856,  904]],

         [[ 220,  277,  334,  391],
          [ 904,  961, 1018, 1075]],

         [[ 256,  322,  388,  454],
          [1048, 1114, 1180, 1246]]]], dtype=torch.int32)

Now, TensorFlow's result is contained in the result that PyTorch produces (it is a subset of it). In fact, TensorFlow's result is a kind of "diagonal" over the two batch dimensions of the higher-dimensional tensor. PyTorch's output is NxTxNxT, so to get exactly the same result as TensorFlow you can do:

torch.stack([dotted[i, :, i, :] for i in range(len(dotted))])
tensor([[[   4,    7,   10,   13],
         [  40,   52,   64,   76],
         [  76,   97,  118,  139],
         [ 112,  142,  172,  202]],

        [[ 616,  655,  694,  733],
         [ 760,  808,  856,  904],
         [ 904,  961, 1018, 1075],
         [1048, 1114, 1180, 1246]]], dtype=torch.int32)

but this doesn't change the fact that you are:

  1. Allocating memory for a tensor of shape NxTxNxT instead of NxTxT (N times as many elements)
  2. Dramatically increasing the computational cost and run time, since every batch entry is multiplied against every other batch entry
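To make the overhead concrete, here is a rough back-of-the-envelope sketch. The helper name `element_counts` and the sizes N=64, T=128 are illustrative assumptions, not values from the example above.

```python
def element_counts(n, t):
    """Hypothetical helper: number of output elements for each approach."""
    full = n * t * n * t   # NxTxNxT tensor produced by tensordot
    diag = n * t * t       # NxTxT tensor that is actually wanted
    return full, diag

# Illustrative sizes (assumed, not taken from the example above).
full, diag = element_counts(64, 128)
print(full, diag, full // diag)  # 67108864 1048576 64
```

Under these assumptions the 4-dimensional tensor holds N times as many elements as the 3-dimensional one, and the same factor applies to the number of multiply-adds.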

Is there a way to get the same 3-dimensional result that TensorFlow gives in PyTorch, without computing the 4-dimensional tensor?

I think you are looking for batch matrix multiplication (torch.bmm), which multiplies two batches of matrices; both tensors have to be 3D, with shapes NxTxD and NxDxT here, giving an NxTxT output. https://pytorch.org/docs/stable/generated/torch.bmm.html
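A minimal sketch of how this could look on the inputs from the question. The cast to float64 is an assumption made here to sidestep any dtype restrictions on integer matrix multiplication; the values are small integers, so the results stay exact. The 4-dimensional tensordot result is computed only to check the output and is not needed in practice.

```python
import numpy as np
import torch

# Same inputs as in the question: x1 is (N, T, D), x2 is (N, D, T).
# Cast to float64 (an assumption, see the note above).
x1 = torch.from_numpy(np.arange(2 * 4 * 3).reshape(2, 4, 3)).double()
x2 = torch.from_numpy(
    np.flip(np.arange(2 * 4 * 3).reshape(2, 3, 4), 1).copy()
).double()

# Batched matrix product: for each batch index n, computes x1[n] @ x2[n].
batched = torch.bmm(x1, x2)
print(batched.shape)  # torch.Size([2, 4, 4])

# Reference for checking only: the 4-D tensordot result reduced to its
# batch "diagonal", as done in the question.
full = torch.tensordot(x1, x2, dims=([2], [1]))
reference = torch.stack([full[i, :, i, :] for i in range(len(full))])
print(torch.equal(batched, reference))  # True

# torch.matmul and torch.einsum give the same NxTxT result.
via_matmul = torch.matmul(x1, x2)
via_einsum = torch.einsum("ntd,ndk->ntk", x1, x2)
print(torch.equal(via_matmul, batched), torch.equal(via_einsum, batched))
```

torch.matmul (or the `@` operator) broadcasts over leading batch dimensions, so it may be the more convenient choice when the inputs have more than one batch dimension.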

