ValueError: shapes (784,32) and (10,784) not aligned: 32 (dim 1) != 10 (dim 0) for a neural network

Question

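I'm trying to build a simple Keras-like neural network library from scratch, but I'm having trouble getting training to work. It has been a while since I wrote a neural network from scratch instead of using a library, so I thought this would be good practice.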

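I'm not sure I set up the constructor correctly for the case where no input shape is given. No matter how many neurons I pass to a layer, or what input shape I use, I get a "ValueError: shapes X and Y not aligned" error. Here is the traceback: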

Traceback (most recent call last):
  File "NNfromScratch.py", line 551, in <module>
    model.train(X_train, y_train, epochs=100, batch_size=10, verbose=True)
  File "NNfromScratch.py", line 427, in train
    self.forward(batch_inputs)
  File "NNfromScratch.py", line 395, in forward
    self.outputs = layer.forward(self.outputs)
  File "NNfromScratch.py", line 153, in forward
    self.outputs = np.dot(self.weights.T, inputs) + self.biases
  File "<__array_function__ internals>", line 6, in dot
ValueError: shapes (784,32) and (10,784) not aligned: 32 (dim 1) != 10 (dim 0)

The error is raised from the forward function of the Dense layer.

The complete (reproducible) code can be seen here.

Here is a snippet of the most important parts, though:

import time
import numpy as np
import pandas as pd
import pickle as pkl
import matplotlib.pyplot as plt
import tensorflow.keras.datasets.mnist as mnist

...

class Layers:
    class Dense:
        def __init__(self, neurons=0, activation=Activations.ReLU, inputs=0, dropout_rate=1):
            # Initialize weights and biases
            self.weights = np.random.randn(neurons, inputs)
            self.biases = np.random.randn(1, neurons)
            self.activation = activation
            self.dropout_rate = dropout_rate
        
        # Forward-Propagation
        def forward(self, inputs):
            self.inputs = inputs
            self.outputs = np.dot(self.weights.T, inputs) + self.biases
            self.outputs = self.activation(self.outputs)
            self.outputs = self.dropout(self.outputs)
            return self.outputs
        
        # Backward-Propagation
        def backward(self, error, learning_rate):
            self.error = error
            self.delta = self.error * self.activation(self.outputs)
            self.delta = self.dropout(self.delta, derivative=True)
            self.weights -= learning_rate * np.dot(self.delta, self.inputs.T)
            self.biases -= learning_rate * np.sum(self.delta, axis=0, keepdims=True)
            return self.delta
        
        # Dropout
        def dropout(self, x, derivative=False):
            if derivative:
                return self.dropout_rate * (1 - self.dropout_rate) * x
            return self.dropout_rate * x


class NeuralNetwork:
    """..."""

    
    def forward(self, inputs):
        # Forward-Propagation
        self.inputs = inputs
        self.outputs = self.inputs
        for layer in self.layers:
            self.outputs = layer.forward(self.outputs)
        return self.outputs
    
    def backward(self, targets):
        # Backward-Propagation
        self.targets = targets
        self.error = self.loss(self.outputs, self.targets)
        self.delta = self.error
        for layer in reversed(self.layers):
            self.delta = layer.backward(self.delta, self.optimizer_kwargs)
        return self.delta
    
    def update_weights(self):
        # Update weights and biases
        for layer in self.layers:
            layer.update_weights(self.optimizer_kwargs)
    
    def train(self, inputs, targets, epochs=1, batch_size=1, verbose=False):
        self.epochs = epochs
        self.epoch_errors = []
        self.epoch_losses = []
        self.epoch_accuracies = []
        self.epoch_times = []
        start = time.time()
        for epoch in range(self.epochs):
            epoch_start = time.time()
            epoch_error = 0
            epoch_loss = 0
            epoch_accuracy = 0
            for i in range(0, inputs.shape[0], batch_size):
                batch_inputs = inputs[i:i+batch_size]
                batch_targets = targets[i:i+batch_size]
                self.forward(batch_inputs)
                self.backward(batch_targets)
                self.update_weights()
                epoch_error += self.error.sum()
                epoch_loss += self.loss(self.outputs, self.targets).sum()
                epoch_accuracy += self.accuracy(self.outputs, self.targets)
            epoch_time = time.time() - epoch_start
            self.epoch_errors.append(epoch_error)
            self.epoch_losses.append(epoch_loss)
            self.epoch_accuracies.append(epoch_accuracy)
            self.epoch_times.append(epoch_time)
            if verbose:
                print('Epoch: {}, Error: {}, Loss: {}, Accuracy: {}, Time: {}'.format(epoch, epoch_error, epoch_loss, epoch_accuracy, epoch_time))
        self.train_time = time.time() - start
        return self.epoch_errors, self.epoch_losses, self.epoch_accuracies, self.epoch_times



# Load and flatten data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape((X_train.shape[0], -1))
X_test = X_test.reshape((X_test.shape[0], -1))
# Build model
model = NeuralNetwork([
    Layers.Dense(32, Activations.ReLU, inputs=X_train.shape[1]),
    Layers.Dense(10, Activations.ReLU),
    Layers.Dense(1, Activations.Softmax)
], Losses.Categorical_Cross_Entropy, Optimizers.SGD, learning_rate=0.01)
model.train(X_train, y_train, epochs=100, batch_size=10, verbose=True)
model.evaluate(X_test, y_test)

Answer 1

Change this line:

self.outputs = np.dot(self.weights.T, inputs) + self.biases

to

self.outputs = np.dot(inputs, self.weights.T) + self.biases

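The reason is that the inner dimensions need to align. Your inputs have shape [B,784] (where B is the batch size) and your weights have shape [32,784], so self.weights.T has shape [784,32]. np.dot(self.weights.T, inputs) tries to multiply [784,32] by [B,784], and the inner dimensions (32 and B = 10) do not match. np.dot(inputs, self.weights.T) multiplies [B,784] by [784,32], the inner dimensions are both 784, and the result has shape [B,32], to which the [1,32] biases broadcast.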
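
To check the shapes independently of the rest of the library, here is a minimal sketch that reproduces the first layer only. It assumes random data in place of MNIST, and the variable names (batch_size, n_inputs, n_neurons, original_order_failed) are illustrative rather than taken from the original code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes taken from the question: batch of 10, 784 input features, 32 neurons.
batch_size, n_inputs, n_neurons = 10, 784, 32

inputs = rng.standard_normal((batch_size, n_inputs))   # [B, 784]
weights = rng.standard_normal((n_neurons, n_inputs))   # [32, 784], same layout as Dense.__init__
biases = rng.standard_normal((1, n_neurons))           # [1, 32]

# Original operand order: [784, 32] . [10, 784], inner dimensions 32 and 10 differ.
try:
    np.dot(weights.T, inputs)
    original_order_failed = False
except ValueError as exc:
    original_order_failed = True
    print("original order:", exc)

# Corrected operand order: [10, 784] . [784, 32] -> [10, 32]; biases broadcast over the batch.
outputs = np.dot(inputs, weights.T) + biases
print("corrected order:", outputs.shape)
```

Printing the shape of each operand just before the np.dot call in forward is a quick way to find the same kind of mismatch in later layers.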
