
Retrain a pybrain neural network after adding to the dataset

Original post: 2011-12-20 21:57:03 | Tags: python, neural-network, pybrain

I have a pybrain NN up and running, and it seems to be working rather well. Ideally, I would like to train the network and obtain a prediction after each new data point (the previous week's figures, in this case) has been added to the dataset.

At the moment I'm doing this by rebuilding the network each time, but training takes longer and longer as each example is added (+2 minutes for each example, in a dataset of thousands of examples).

Is there a way to speed up the process by adding the new example to an already trained NN and updating it? Or am I overcomplicating the matter, and would I be better served by training on a single set of examples (say, last year's data) and then testing on all of the new examples (this year's)?

1 Answer

It depends on what your objective is. If you need an up-to-date NN model, you can train online, i.e. perform a single step of back-propagation with the sample acquired at time $t$, starting from the network you had at time $t-1$. Alternatively, you can discard the older samples so that the number of training samples stays fixed, or you can shrink the training set with some form of clustering (i.e. merging similar samples into a single one).
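To make the first two options concrete, here is a minimal sketch in plain Python. It deliberately avoids pybrain's API, so `TinyNet`, `online_step`, `add_sample`, the learning rate and the window size of 200 are all hypothetical names and values chosen for illustration, not anything taken from the question. It shows one back-propagation step on the newest sample starting from the existing weights, plus a fixed-size buffer that drops the oldest sample automatically.

```python
import math
import random
from collections import deque


class TinyNet:
    """Illustrative one-hidden-layer regressor (tanh hidden units, linear output)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hi for w, hi in zip(self.w2, h)) + self.b2

    def online_step(self, x, y, lr=0.05):
        """One back-propagation step on a single (x, y) sample.

        Starts from the current weights (the network at time t-1) and
        returns the squared error measured before the update.
        """
        h = self._hidden(x)
        out = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        err = out - y  # gradient of 0.5 * err**2 with respect to the output
        for j, hj in enumerate(h):
            # Gradient at hidden unit j, computed with the old w2[j].
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err
        return err * err


net = TinyNet(n_in=2, n_hidden=4)
window = deque(maxlen=200)  # fixed-size training set: oldest sample falls out


def add_sample(x, y):
    """Store the new sample, update the existing network, return a prediction."""
    window.append((x, y))
    net.online_step(x, y)  # update in place instead of rebuilding from scratch
    return net.predict(x)
```

With this shape, the cost of handling a new data point stays constant instead of growing with the dataset. If a single step per sample turns out to be too noisy, one option is to run a few passes over `window` after each append, which still bounds the work by the window size.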

If you explain your application in more detail, it will be easier to suggest solutions.