Multilayer Perceptron Fails to Converge
This is a simple MLP with backpropagation that I am writing for binary image classification:
import numpy

class MLP:
    def __init__(self, size, epochs = 1000, learning_rate = 1):
        # Layer 1: (size + 1) inputs including the bias, 3 hidden units
        self.l1weights = numpy.random.random((size + 1, 3))
        # Layer 2: 3 hidden units feeding a single output unit
        self.l2weights = numpy.random.random(3)
        self.epochs = epochs
        self.learning_rate = learning_rate

    def predict(self, _input_):
        # Append bias at the beginning of input
        l1output = self.sigmoid(numpy.dot(numpy.append([1], _input_), self.l1weights))
        l2output = self.sigmoid(numpy.dot(l1output, self.l2weights))
        return l1output, l2output

    def train(self, training_set, training_goal):
        for epoch in range(self.epochs):
            l1squared_error = 0
            l2squarederror = 0
            for set_index in range(training_goal.shape[0]):
                set = training_set[set_index]
                l1output, l2output = self.predict(set)
                l2error = training_goal[set_index] - l2output
                l1error = l2error * self.dsigmoid(l2output) * self.l2weights
                self.l1weights[0] = self.l1weights[0] + self.learning_rate * l1error
                for index in range(len(self.l1weights) - 1):
                    self.l1weights[index + 1] += self.learning_rate * l1error * self.dsigmoid(l1output)
                for index in range(len(self.l2weights)):
                    self.l2weights[index] += self.learning_rate * l2error * self.dsigmoid(l2output)
                l1squared_error += sum(l1error ** 2)
                l2squarederror += l2error ** 2
            print("Squared error at epoch " + str(epoch) + " : " + str(l1squared_error) + ", " + str(l2squarederror))

    def sigmoid(self, _input_):
        # Logistic sigmoid activation
        return 1 / (1 + numpy.exp(-_input_))

    def dsigmoid(self, _input_):
        # Sigmoid derivative, written in terms of the sigmoid's output
        return _input_ * (1 - _input_)
When I run it, sometimes every output converges to 1. In the relatively more successful runs, the predictions for class 0 converge to 0.5 while the predictions for class 1 stay near 0.75, and the layer 2 error stops changing after about 1000 epochs. This is from testing with 2x2 image classification using the code below:
def image_class(input):
    return 1 if input >= 2 else 0

training_set = ((numpy.arange(2**4)[:,None] & (1 << numpy.arange(4))) != 0)
training_goals = numpy.array([image_class(sum(i)) for i in training_set])
mlp = MLP(size=4)
mlp.train(training_set, training_goals)
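To make the test data concrete, here is a small sketch of what the bit-mask expression above builds. The name `pixels` is introduced only for illustration, and the label rule is rewritten in vectorized form on the assumption that it matches `image_class(sum(i))`.

```python
import numpy

# Each integer 0..15 is read as a 4-bit mask; bit k says whether pixel k is on.
pixels = numpy.arange(4)
training_set = (numpy.arange(2**4)[:, None] & (1 << pixels)) != 0

# Label is 1 when at least two of the four pixels are on.
training_goals = (training_set.sum(axis=1) >= 2).astype(int)

print(training_set.shape)              # (16, 4)
print(training_set[5].astype(int))     # [1 0 1 0]  (5 == 0b0101)
print(training_goals.sum())            # 11
```

So the set is every possible 2x2 binary image, and 11 of the 16 images carry the label 1, which means the two classes are not balanced.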
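I was able to solve this problem, at least for 2x2 recognition, by adding a layer with a step activation instead of a sigmoid immediately after the output layer and training it separately from the initial network.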
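For comparison, here is a minimal sketch of the textbook per-sample gradient step for the same 4-3-1 sigmoid network with squared error. It is not the fix described above. In the chain-rule gradient, each weight's update is scaled by the activation feeding that weight: the input pixel (or the bias input 1) for layer 1, and the hidden output for layer 2. The updates in the question's `train` leave those factors out, and the bias row also skips the hidden-layer sigmoid derivative. Whether that fully explains the stalled outputs is not established here. The function names, the initialization range of [-1, 1], the learning rate of 0.5, and the fixed seed are all assumptions made for this sketch. It keeps the question's architecture, including the absence of a bias on the output unit.

```python
import numpy

def sigmoid(z):
    return 1.0 / (1.0 + numpy.exp(-z))

def forward(x, w1, w2):
    """x: (n_inputs,), w1: (n_inputs + 1, n_hidden), w2: (n_hidden,)."""
    x_b = numpy.append([1.0], x)     # bias input first, as in the question
    hidden = sigmoid(x_b @ w1)       # hidden activations, shape (n_hidden,)
    output = sigmoid(hidden @ w2)    # single output unit, scalar
    return x_b, hidden, output

def gradients(x, target, w1, w2):
    """Gradients of 0.5 * (target - output)**2 with respect to w1 and w2."""
    x_b, hidden, output = forward(x, w1, w2)
    delta_out = (output - target) * output * (1.0 - output)     # scalar
    delta_hidden = delta_out * w2 * hidden * (1.0 - hidden)     # (n_hidden,)
    grad_w2 = delta_out * hidden                  # scaled by hidden outputs
    grad_w1 = numpy.outer(x_b, delta_hidden)      # scaled by inputs and bias
    return grad_w1, grad_w2

def train(inputs, targets, n_hidden=3, epochs=1000, learning_rate=0.5, seed=0):
    # Assumed hyperparameters; tune them for your own data.
    rng = numpy.random.default_rng(seed)
    w1 = rng.uniform(-1.0, 1.0, (inputs.shape[1] + 1, n_hidden))
    w2 = rng.uniform(-1.0, 1.0, n_hidden)
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            g1, g2 = gradients(x, t, w1, w2)
            w1 -= learning_rate * g1    # step against the gradient
            w2 -= learning_rate * g2
    return w1, w2

if __name__ == "__main__":
    inputs = ((numpy.arange(16)[:, None] & (1 << numpy.arange(4))) != 0).astype(float)
    targets = (inputs.sum(axis=1) >= 2).astype(float)
    w1, w2 = train(inputs, targets)
    for x, t in zip(inputs, targets):
        print(x, t, round(float(forward(x, w1, w2)[2]), 3))
```

A quick way to check any hand-written backpropagation code is to compare its gradients with finite differences of the loss on a single sample; the two should agree to several decimal places.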