How to create a custom layer that takes two inputs and applies tf.matmul

Question

The code for my model is:

class KeyQuery(keras.layers.Layer):
    def __init__(self, v):
        super(KeyQuery, self).__init__()
        self.v = tf.convert_to_tensor(v)
        
    def build(self, input_shape): 
        self.v = tf.Variable(self.v, trainable = True)
        print(self.v.shape)
    def call(self, inputs1, inputs2):
        y1 = tf.matmul(self.v, tf.transpose(inputs1))
        y2 = tf.matmul(y1, inputs2)
        return y2
    
keyquery = KeyQuery(v)

inputs1 = keras.Input(shape=(50,768))
inputs2 = keras.Input(shape=(50,3))
outputs = keyquery(inputs1,inputs2)
model = keras.Model([inputs1,inputs2], outputs)
model.summary()

Here the v in keyquery = KeyQuery(v) is a 2-D array of shape (1, 768), which can also be viewed as a vector.

Ideally, in y1 = tf.matmul(self.v, tf.transpose(inputs1)), since self.v.shape is (1, 768) and inputs1 has shape (50, 768), y1 should have shape (1, 50). inputs2 has shape (50, 3), so y2 should have shape (1, 3).

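So, taking the batch dimension into account, when inputs1.shape is (None, 50, 768) and inputs2.shape is (None, 50, 3), the layer should return a result of shape (None, 1, 3). Note that the shape passed to keras.Input does not include the batch dimension.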

In practice, however, it raises ValueError: Dimensions must be equal, but are 768 and 50 for '{{node key_query_4/MatMul}} = BatchMatMulV2[T=DT_FLOAT, adj_x=false, adj_y=false](key_query_4/MatMul/ReadVariableOp, key_query_4/transpose)' with input shapes: [1,768], [768,50,?]. This is caused by the batch dimension. I don't know how to fix this for my matrix multiplication.

Answer 1

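You need to consider two things:

In the build method, the argument is the shape of the input tensor; in the call method, it is the tensor itself (its value).
Use the perm argument of tf.transpose() to keep the batch dimension in place and swap only the other two dimensions.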
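The second point can be checked on shapes alone. The sketch below is only an illustration, assuming a batch size of 4 and zero-filled placeholder tensors; it shows where the batch axis lands with and without perm.

```python
import tensorflow as tf

# Placeholder tensors: a batch of 4 standing in for inputs1, plus the vector v.
x = tf.zeros((4, 50, 768))
v = tf.zeros((1, 768))

# Without perm, tf.transpose reverses every axis, so the batch axis ends up last.
default_t = tf.transpose(x)
# With perm=[0, 2, 1], the batch axis stays first and only the last two axes swap.
batched_t = tf.transpose(x, perm=[0, 2, 1])

print(default_t.shape)  # (768, 50, 4)
print(batched_t.shape)  # (4, 768, 50)

# tf.matmul broadcasts v across the batch axis of the second operand.
y1 = tf.matmul(v, batched_t)
print(y1.shape)  # (4, 1, 50)
```

The first printed shape matches the [768,50,?] in the error message from the question: the batch axis was moved to the end, so the inner dimensions 768 and 50 no longer line up.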

Code:

import tensorflow as tf
import numpy as np

class KeyQuery(tf.keras.layers.Layer):
    def __init__(self, v):
        super(KeyQuery, self).__init__()
        self.v = tf.convert_to_tensor(v, dtype='float32')
        
    def build(self, input_shape): # here we have shape of input_tensor
        self.v = tf.Variable(self.v, trainable = True)

    def call(self, inputs): # here we have value of input_tensor
        y1 = tf.matmul(self.v, tf.transpose(inputs[0], perm=[0,2,1]))
        y2 = tf.matmul(y1, inputs[1])
        return y2
    
keyquery = KeyQuery(np.random.rand(1,768))
out = keyquery((tf.random.uniform((25, 50, 768)), tf.random.uniform((25, 50, 3))))
print(out.shape)
# (25, 1, 3)

# or with model
inputs1 = tf.keras.Input(shape=(50,768))
inputs2 = tf.keras.Input(shape=(50,3))
outputs = keyquery((inputs1,inputs2))
model = tf.keras.Model([inputs1,inputs2], outputs)
model.summary()

Output:

Model: "model_2"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_13 (InputLayer)          [(None, 50, 768)]    0           []                               
                                                                                                  
 input_14 (InputLayer)          [(None, 50, 3)]      0           []                               
                                                                                                  
 key_query_27 (KeyQuery)        (None, 1, 3)         768         ['input_13[0][0]',               
                                                                  'input_14[0][0]']               
                                                                                                  
==================================================================================================
Total params: 768
Trainable params: 768
Non-trainable params: 0
__________________________________________________________________________________________________
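As a possible variation (a sketch, not the code from the answer above), the explicit transpose can be replaced by the transpose_b argument of tf.matmul, and the variable can be registered through add_weight instead of a bare tf.Variable. The names KeyQueryAlt and v_init are hypothetical and chosen only for this example.

```python
import numpy as np
import tensorflow as tf

class KeyQueryAlt(tf.keras.layers.Layer):  # hypothetical name, for illustration
    def __init__(self, v):
        super().__init__()
        # Keep the initial value as a float32 array; the weight is created in build().
        self.v_init = np.asarray(v, dtype="float32")

    def build(self, input_shape):
        # Register a trainable weight that starts from the supplied values.
        self.v = self.add_weight(
            name="v",
            shape=self.v_init.shape,
            initializer=tf.keras.initializers.Constant(self.v_init),
            trainable=True,
        )

    def call(self, inputs):
        inputs1, inputs2 = inputs
        # transpose_b=True swaps only the last two axes of inputs1,
        # so the batch axis is left where it is.
        y1 = tf.matmul(self.v, inputs1, transpose_b=True)  # (batch, 1, 50)
        return tf.matmul(y1, inputs2)                      # (batch, 1, 3)

layer = KeyQueryAlt(np.random.rand(1, 768))
out = layer((tf.random.uniform((25, 50, 768)), tf.random.uniform((25, 50, 3))))
print(out.shape)  # (25, 1, 3)
```

Either form should produce the same shapes as the answer's layer; which one reads better is largely a matter of taste.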
