How do I interpret the weights in this branched tf model?
I couldn't find an existing question that fits this scenario; if it has already been answered elsewhere, please feel free to point me to it.
I have a tensorflow model with the following specs (at the end of the question there is minimal reproducible code that anyone can test).
I can't understand why the weight matrix of the convolutional layer in the new branch still has shape (3, .., ..).
My model:
import tensorflow as tf

# Input shape taken from the model summary below: (batch, 3, 47, 12)
inputs = tf.keras.Input(shape=(3, 47, 12), name='Input')

# Declaring a couple of layers
Conv_1 = tf.keras.layers.Conv1D(
75,
activation='swish',
kernel_size=3,
strides=2,
padding='valid',
name='Conv_1',
# data_format = 'channels_first',
)(inputs)
Av_P_1 = tf.keras.layers.AveragePooling2D(
pool_size=(1,3),
strides=(1,2),
padding='valid',
data_format='channels_first',
name='Av_P_1'
)(Conv_1)
Layer_N_1 = tf.keras.layers.LayerNormalization(
name='Layer_N_1'
)(Av_P_1)
Dense_1 = tf.keras.layers.Dense(
70,
activation='swish',
name='Dense_1'
)(Layer_N_1)
Layer_N_2 = tf.keras.layers.LayerNormalization(
name='Layer_N_2'
)(Dense_1)
# Unstacking here
s,t,r = tf.unstack(
Layer_N_2,
axis=1
)
# Branching here
Conv_2 = tf.keras.layers.Conv1D(50,
activation='swish',
kernel_size=3,
strides=2,
padding='valid',
name='Conv_2',
)(tf.concat([s], axis = 1)) # <-------- Branching here
Av_P_2 = tf.keras.layers.AveragePooling1D(
pool_size=3,
strides=1,
padding='valid',
name = 'Av_P_2'
)(Conv_2)
Layer_N_3 = tf.keras.layers.LayerNormalization(
name='Layer_N_3'
)(Av_P_2)
LSTM_1 = tf.keras.layers.LSTM(35,
return_sequences=True,
name='LSTM_1'
)(Layer_N_3)
test_model = tf.keras.Model(inputs=inputs, outputs=[LSTM_1])
print(test_model.summary())
Model: "model_11"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
Input (InputLayer) [(None, 3, 47, 12)] 0
_________________________________________________________________
Conv_1 (Conv1D) (None, 3, 23, 75) 2775
_________________________________________________________________
Av_P_1 (AveragePooling2D) (None, 3, 23, 37) 0
_________________________________________________________________
Layer_N_1 (LayerNormalizatio (None, 3, 23, 37) 74
_________________________________________________________________
Dense_1 (Dense) (None, 3, 23, 70) 2660
_________________________________________________________________
Layer_N_2 (LayerNormalizatio (None, 3, 23, 70) 140
_________________________________________________________________
tf.unstack_19 (TFOpLambda) [(None, 23, 70), (None, 2 0
_________________________________________________________________
tf.identity_4 (TFOpLambda) (None, 23, 70) 0
_________________________________________________________________
Conv_2 (Conv1D) (None, 11, 50) 10550
_________________________________________________________________
Av_P_2 (AveragePooling1D) (None, 9, 50) 0
_________________________________________________________________
Layer_N_3 (LayerNormalizatio (None, 9, 50) 100
_________________________________________________________________
LSTM_1 (LSTM) (None, 9, 35) 12040
=================================================================
Total params: 28,339
Trainable params: 28,339
Non-trainable params: 0
_________________________________________________________________
Shape of each element (s, t, r) after unstack:
(TensorShape([None, 23, 70]),
TensorShape([None, 23, 70]),
TensorShape([None, 23, 70]))
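As a quick sketch of what tf.unstack is doing here (shapes taken from the summary above, with a concrete batch size of 4 for illustration): unstacking a (batch, 3, 23, 70) tensor along axis=1 removes that axis and yields three (batch, 23, 70) tensors, one per slice.

```python
import tensorflow as tf

# (batch, 3, 23, 70) -> three tensors of shape (batch, 23, 70)
x = tf.zeros((4, 3, 23, 70))
s, t, r = tf.unstack(x, axis=1)
print(s.shape, t.shape, r.shape)  # (4, 23, 70) each
```

Note that this is why Conv_2 sees 70 input channels: after the unstack, the branch input no longer has the 3-way axis at all.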
Shapes of the weights and biases for the conv layers only:
###### Name: ######
Layer Name: Conv_1
Weights:
Conv_1/kernel:0
(3, 12, 75)
Biases:
Conv_1/bias:0
(75,)
###### Name: ######
Layer Name: Conv_2
Weights:
Conv_2/kernel:0
(3, 70, 50) <---- why does the weight matrix still have 3 channels?
Biases:
Conv_2/bias:0
(50,)
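For reference, a printout like the one above can be produced with a loop along these lines (the loop and formatting here are an illustrative sketch, not necessarily the exact code used; the small model mirrors Conv_1 from the question):

```python
import tensorflow as tf

# A minimal model mirroring Conv_1: 12 input channels, 75 filters, kernel_size=3.
inp = tf.keras.Input(shape=(47, 12))
out = tf.keras.layers.Conv1D(75, kernel_size=3, name='Conv_1')(inp)
model = tf.keras.Model(inp, out)

for layer in model.layers:
    if isinstance(layer, tf.keras.layers.Conv1D):
        # get_weights() returns [kernel, bias] as numpy arrays
        w, b = layer.get_weights()
        print('Layer Name:', layer.name)
        print('Weights:', w.shape)  # (kernel_size, in_channels, filters)
        print('Biases:', b.shape)   # (filters,)
```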
Please feel free to ask any follow-up questions, as I realize I may not have been clear enough.
PS: I know how to use channels_first and channels_last.
So, following this answer: the (3, .., ..) in the weight matrix does not correspond to the number of channels but to the kernel size. A Conv1D kernel has shape (kernel_size, input_channels, filters), so the leading 3 in both kernels above is the kernel_size=3 passed to the layers; it only coincidentally matches the 3 channels in the input. In the example above, the kernel size used in the Conv1D layers is 3; if the kernel size is changed to 5, for example, the weight matrices would look like this:
###### Name: ######
Layer Name: Conv_1
Weights:
Conv_1/kernel:0
(5, 12, 75)
Biases:
Conv_1/bias:0
(75,)
###### Name: ######
Layer Name: Conv_2
Weights:
Conv_2/kernel:0
(5, 70, 50)
Biases:
Conv_2/bias:0
(50,)
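This can be checked directly: a minimal sketch (shapes chosen to match Conv_2 above) showing that the first kernel dimension tracks kernel_size, not the channel count:

```python
import tensorflow as tf

# Conv_2's input after unstacking is (batch, 23, 70): 70 input channels.
# With 50 filters and kernel_size=5 the kernel is (5, 70, 50) —
# the leading dimension is the kernel_size, not a channel count.
x = tf.keras.Input(shape=(23, 70))
conv = tf.keras.layers.Conv1D(50, kernel_size=5, name='Conv_2')
_ = conv(x)  # calling the layer builds its weights
print(conv.kernel.shape)  # (5, 70, 50)
print(conv.bias.shape)    # (50,)
```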