
tf.multiply vs tf.matmul to calculate the dot product

I have a matrix (of vectors) X with shape [3,4], and I want to calculate the dot product between each pair of vectors: (X[1].X[1]), (X[1].X[2]), ...etc.

I saw some cosine similarity code where they use

tf.reduce_sum(tf.multiply(X, X), axis=1)

to calculate the dot product between the vectors in a matrix of vectors. However, this only calculates the dot product between (X[i], X[i]).

I used tf.matmul(X, X, transpose_b=True), which calculates the dot product between every two vectors, but I am still confused why tf.multiply didn't do this; I think the problem is with my code.

The code is:

import tensorflow as tf

data = [[1.0, 2.0, 4.0, 5.0], [0.0, 6.0, 7.0, 8.0], [8.0, 1.0, 1.0, 1.0]]
X = tf.constant(data)

# dot product between every pair of rows: X . X^T
matResult = tf.matmul(X, X, transpose_b=True)

# element-wise square, then sum over each row: only X[i].X[i]
multiplyResult = tf.reduce_sum(tf.multiply(X, X), axis=1)

with tf.Session() as sess:
    print('matResult')
    print(sess.run([matResult]))
    print()
    print('multiplyResult')
    print(sess.run([multiplyResult]))

The output is:

matResult
[array([[  46.,   80.,   19.],
       [  80.,  149.,   21.],
       [  19.,   21.,   67.]], dtype=float32)]

multiplyResult
 [array([  46.,  149.,   67.], dtype=float32)]

I would appreciate any advice.

tf.multiply(X, Y) does element-wise multiplication so that

[[1 2]    [[1 3]      [[1 6]
 [3 4]] .  [2 1]]  =   [6 4]]
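
A quick way to check this (a minimal sketch using the same TF1-style session API as the question; in TF2 you would use eager execution or tf.compat.v1.Session):

import tensorflow as tf

A = tf.constant([[1., 2.], [3., 4.]])
B = tf.constant([[1., 3.], [2., 1.]])

elementwise = tf.multiply(A, B)  # equivalent to A * B

with tf.Session() as sess:
    print(sess.run(elementwise))  # [[1. 6.] [6. 4.]]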

whereas tf.matmul does matrix multiplication so that

[[1 0]    [[1 3]      [[1 3]
 [0 1]] .  [2 1]]  =   [2 1]]
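
And the same check for tf.matmul (again a sketch, not from the original answer):

import tensorflow as tf

I = tf.constant([[1., 0.], [0., 1.]])  # identity matrix
B = tf.constant([[1., 3.], [2., 1.]])

with tf.Session() as sess:
    print(sess.run(tf.matmul(I, B)))  # [[1. 3.] [2. 1.]]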

Using tf.matmul(X, X, transpose_b=True) means that you are calculating X . X^T, where ^T indicates the transposing of the matrix and . is the matrix multiplication.
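
With the X from the question, transpose_b=True is just a shortcut for transposing the second argument yourself (a sketch; both lines below print the same 3x3 Gram matrix):

import tensorflow as tf

X = tf.constant([[1., 2., 4., 5.], [0., 6., 7., 8.], [8., 1., 1., 1.]])

gram_a = tf.matmul(X, X, transpose_b=True)  # X . X^T
gram_b = tf.matmul(X, tf.transpose(X))      # same thing, transposed explicitly

with tf.Session() as sess:
    print(sess.run([gram_a, gram_b]))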

tf.reduce_sum(_, axis=1) takes the sum along the 1st axis (starting counting with 0), which means you are summing the rows:

tf.reduce_sum([[a, b], [c, d]], axis=1) = [a+b, c+d]
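
To see the axis argument in action (a minimal sketch):

import tensorflow as tf

M = tf.constant([[1., 2.], [3., 4.]])

with tf.Session() as sess:
    print(sess.run(tf.reduce_sum(M, axis=1)))  # [3. 7.]  -> sums each row
    print(sess.run(tf.reduce_sum(M, axis=0)))  # [4. 6.]  -> sums each column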

This means that:

tf.reduce_sum(tf.multiply(X, X), axis=1) = [X[1].X[1], ..., X[n].X[n]]

so that is the one you want if you only want the (squared) norms of each row. On the other hand,

 tf.matmul(X, X, transpose_b=True) = [[ X[1].X[1], X[1].X[2], ..., X[1].X[n]], 
                                       [X[2].X[1], ..., X[2].X[n]],
                                       ...
                                       [X[n].X[1], ..., X[n].X[n]]]

so that is what you need if you want the similarity between all pairs of rows.
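
In other words, the reduce_sum result is exactly the diagonal of the matmul result. A sketch that checks this on the question's data (tf.linalg.diag_part is the name in recent TF releases; older 1.x versions call it tf.matrix_diag_part):

import tensorflow as tf

X = tf.constant([[1., 2., 4., 5.], [0., 6., 7., 8.], [8., 1., 1., 1.]])

gram = tf.matmul(X, X, transpose_b=True)             # all pairwise dot products
row_dots = tf.reduce_sum(tf.multiply(X, X), axis=1)  # only X[i].X[i]

with tf.Session() as sess:
    print(sess.run(row_dots))                   # [ 46. 149.  67.]
    print(sess.run(tf.linalg.diag_part(gram)))  # same values: the diagonal of matResult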

What tf.multiply(X, X) does is essentially multiplying each element of the matrix with itself, like

[[1 2]
 [3 4]]

would turn into

[[1 4]
 [9 16]]

whereas tf.reduce_sum(_, axis=1) takes a sum of each row, so the result for the previous example will be

[5 25]

which is exactly (by definition) equal to [X[0, :] @ X[0, :], X[1, :] @ X[1, :]].
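
Checking that numerically (a sketch in the same session style as above):

import tensorflow as tf

X = tf.constant([[1., 2.], [3., 4.]])

squared = tf.multiply(X, X)                # [[1. 4.] [9. 16.]]
row_sums = tf.reduce_sum(squared, axis=1)  # [5. 25.]

with tf.Session() as sess:
    print(sess.run(row_sums))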

Just put it down with variable names [[a b], [c d]] instead of actual numbers and look at what tf.matmul(X, X) and tf.multiply(X, X) do.
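
Worked out with those variable names (this expansion is not in the original answer, but it follows directly from the definitions above):

tf.matmul(X, X):

[[a b]    [[a b]      [[a*a + b*c   a*b + b*d]
 [c d]] .  [c d]]  =   [c*a + d*c   c*b + d*d]]

tf.multiply(X, X):

[[a b]    [[a b]      [[a*a  b*b]
 [c d]] *  [c d]]  =   [c*c  d*d]]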

In short, tf.multiply() does an element-wise product (the Hadamard product, not the dot product), whereas tf.matmul() does actual matrix multiplication. So tf.multiply() needs arguments of the same shape, so that the element-wise product is possible, i.e. the shapes are (n,m) and (n,m). But tf.matmul() needs arguments of shapes (n,m) and (m,p), so that the resulting matrix is (n,p) [the usual math].

Once understood, this can be applied to multi-dimensional matrices easily.
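
For example (a sketch of how the shapes generalize; the shapes here are made up for illustration):

import tensorflow as tf

# tf.matmul treats the last two axes as matrices and batches over the rest:
A = tf.ones([2, 3, 4])
B = tf.ones([2, 4, 5])
print(tf.matmul(A, B).shape)    # (2, 3, 5): one (3,4) x (4,5) product per batch

# tf.multiply stays element-wise, so the shapes must match (or broadcast):
print(tf.multiply(A, A).shape)  # (2, 3, 4)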
