Hyper-parameter tuning and Over-fitting with Feed-Forward Neural Network - Mini-Batch Epoch and Cross Validation
I am looking at implementing a hyper-parameter tuning method for a feed-forward neural network (FNN) implemented using PyTorch. My original FNN (the model is named net) has been implemented using a mini-batch learning approach with epochs:
import torch
import torch.nn as nn
from sklearn.utils import shuffle

# Parameters
batch_size = 50        # larger batch size leads to over-fitting
num_epochs = 1000
learning_rate = 0.01   # AKA step size - the amount the weights are updated during training
batch_no = len(x_train) // batch_size

criterion = nn.CrossEntropyLoss()  # classification loss computed on the raw class scores (logits)
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    if epoch % 20 == 0:
        print('Epoch {}'.format(epoch + 1))
    x_train, y_train = shuffle(x_train, y_train)
    # Mini-batch learning - mini-batch since batch size < n (batch gradient
    # descent) but > 1 (stochastic gradient descent)
    for i in range(batch_no):
        start = i * batch_size
        end = start + batch_size
        x_var = torch.FloatTensor(x_train[start:end])
        y_var = torch.LongTensor(y_train[start:end])
        # Forward + backward + optimize
        optimizer.zero_grad()
        ypred_var = net(x_var)
        loss = criterion(ypred_var, y_var)
        loss.backward()
        optimizer.step()
Lastly, I test my model on a separate, held-out test set.
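For completeness, a minimal sketch of that test-set evaluation, assuming x_test and y_test are NumPy arrays like the training data:

import torch

# Evaluate on the held-out test set; no gradient tracking needed.
net.eval()
with torch.no_grad():
    x_var = torch.as_tensor(x_test, dtype=torch.float32)
    y_var = torch.as_tensor(y_test, dtype=torch.long)
    preds = net(x_var).argmax(dim=1)          # predicted class per sample
    accuracy = (preds == y_var).float().mean().item()
print('Test accuracy: {:.3f}'.format(accuracy))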
I came across an approach that uses randomised search to tune the hyper-parameters, combined with K-fold cross-validation (scikit-learn's RandomizedSearchCV).
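One commonly used bridge for that route is skorch, which wraps a PyTorch nn.Module in a scikit-learn-compatible estimator so RandomizedSearchCV (with its built-in K-fold CV) can drive it. A minimal sketch, assuming skorch is installed and Net is the module class behind net:

import numpy as np
import torch
import torch.nn as nn
from skorch import NeuralNetClassifier
from sklearn.model_selection import RandomizedSearchCV

# Wrap the PyTorch module class in a scikit-learn-compatible estimator.
model = NeuralNetClassifier(
    Net,                              # assumed: the class behind `net`
    criterion=nn.CrossEntropyLoss,
    optimizer=torch.optim.Adam,
    max_epochs=100,
    verbose=0,
)

# Hyper-parameter values to sample from.
param_dist = {
    'lr': [0.001, 0.01, 0.1],
    'batch_size': [16, 50, 128],
    'max_epochs': [100, 500, 1000],
}

# cv=5 runs 5-fold cross-validation for each sampled configuration.
search = RandomizedSearchCV(model, param_dist, n_iter=10, cv=5, scoring='accuracy')
search.fit(x_train.astype(np.float32), y_train.astype(np.int64))
print(search.best_params_, search.best_score_)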
My question is two-fold (no pun intended!). The first part is theoretical: is k-fold cross-validation necessary, or could it add any benefit, for a mini-batch feed-forward neural network? From what I can see, the mini-batch approach should do roughly the same job of preventing over-fitting. I also found a good answer here, but I'm not sure it addresses a mini-batch approach specifically.
Secondly, if k-fold is not necessary, is there another hyper-parameter tuning function for PyTorch that would avoid having to create one manually?
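For what it's worth, PyTorch itself does not ship a tuner, but standalone libraries exist; Optuna is one option. A purely illustrative sketch, where train_for_trial and validation_accuracy are hypothetical helpers standing in for the training loop above:

import optuna
import torch
import torch.nn as nn

def objective(trial):
    # Sample hyper-parameters for this trial.
    lr = trial.suggest_float('lr', 1e-4, 1e-1, log=True)
    batch_size = trial.suggest_categorical('batch_size', [16, 50, 128])
    net = Net()  # assumed: same module class as above
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    train_for_trial(net, optimizer, criterion, batch_size)  # hypothetical helper
    return validation_accuracy(net)                         # hypothetical helper

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=20)
print(study.best_params)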
Small batches can offer a regularizing effect (Wilson and Martinez, 2003), perhaps due to the noise they add to the learning process. Generalization error is often best for a batch size of 1. Training with such a small batch size might require a small learning rate to maintain stability because of the high variance in the estimate of the gradient. The total runtime can be very high as a result of the need to make more steps, both because of the reduced learning rate and because it takes more steps to observe the entire training set.
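That said, if you did want to combine the mini-batch loop with K-fold validation to compare hyper-parameter settings, a minimal sketch using scikit-learn's KFold (a fresh model per fold; train_one_run is a hypothetical helper wrapping the mini-batch loop above):

import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import KFold

kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_acc = []
for train_idx, val_idx in kf.split(x_train):
    fold_net = Net()  # assumed: fresh instance of the same module class
    optimizer = torch.optim.Adam(fold_net.parameters(), lr=learning_rate)
    criterion = nn.CrossEntropyLoss()
    # Reuse the mini-batch loop above on this fold's training split.
    train_one_run(fold_net, optimizer, criterion,
                  x_train[train_idx], y_train[train_idx])  # hypothetical helper
    with torch.no_grad():
        x_val = torch.as_tensor(x_train[val_idx], dtype=torch.float32)
        y_val = torch.as_tensor(y_train[val_idx], dtype=torch.long)
        preds = fold_net(x_val).argmax(dim=1)
        fold_acc.append((preds == y_val).float().mean().item())
print('Mean CV accuracy:', np.mean(fold_acc))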