Best Way to Overcome Early Convergence for Machine Learning Model

I have built a machine learning model that predicts weather data; in this case, I am predicting whether or not it will rain tomorrow (a binary Yes/No prediction).

The dataset has about 50 input variables and 65,000 entries.

I am currently running an RNN with a single hidden layer of 35 nodes. I am using PyTorch's NLLLoss as my loss function and Adaboost as the optimizer. I've tried many different learning rates, and 0.01 seems to work fairly well.

After running for 150 epochs, I notice that the model converges to around 0.80 accuracy on my test data. I would like this to be higher, but the model seems to be stuck oscillating around some sort of saddle point or local minimum. (A graph of this is below.)

What are the most effective ways to get out of this "valley" that the model seems to be stuck in?

[Figure: test loss in red, training loss in blue]

I'm not sure why exactly you are using only one hidden layer, or what the shape of your history data is, but here are some things you can try (a rough sketch follows the list):

  1. Try more than one hidden layer.
  2. Experiment with LSTM and GRU layers, and with combinations of these layers together with the RNN.
  3. Reconsider the shape of your data, i.e. how much history you look at to predict the weather.
  4. Make sure your features are scaled properly, since you have about 50 input variables.
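
To make points 1, 2 and 4 concrete, here is a minimal sketch of a stacked GRU classifier fed with scaled inputs. The layer sizes, the 7-step sequence length, and the use of StandardScaler are illustrative assumptions, not details from the question:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.preprocessing import StandardScaler

class RainClassifier(nn.Module):
    """Stacked GRU followed by a small fully connected head (sizes are illustrative)."""
    def __init__(self, n_features=50, hidden_size=64, num_layers=2):
        super().__init__()
        # num_layers > 1 stacks recurrent layers; nn.LSTM is a drop-in alternative to nn.GRU
        self.rnn = nn.GRU(input_size=n_features, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_size, 32),
            nn.ReLU(),
            nn.Linear(32, 2),         # two classes: rain / no rain
            nn.LogSoftmax(dim=1),     # NLLLoss expects log-probabilities
        )

    def forward(self, x):             # x: (batch, seq_len, n_features)
        out, _ = self.rnn(x)
        return self.head(out[:, -1])  # classify from the last time step

# Feature scaling: fit on the training split only, then reuse the same scaler for test data.
X_train = np.random.randn(1000, 50)              # placeholder for your real features
scaler = StandardScaler().fit(X_train)
X_scaled = scaler.transform(X_train)

# Shape the scaled features into overlapping windows of 7 past days (seq_len is an assumption).
seq = torch.tensor(X_scaled, dtype=torch.float32).unfold(0, 7, 1).transpose(1, 2)
model = RainClassifier()
log_probs = model(seq)                           # (n_windows, 2) log-probabilities
```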

Your question is a little ambiguous, as you mentioned an RNN with a single hidden layer. Without knowing the entire neural network architecture, it is tough to say how you can bring in improvements, so I would like to add a few points.

  • You mentioned that you are using "Adaboost" as the optimization function, but PyTorch doesn't have any such optimizer. Did you try using the SGD or Adam optimizers, which are very useful?

  • Do you have any regularization term in the loss function? Are you familiar with dropout? Did you check the training performance? Does your model overfit? (See the sketch after this list for dropout and weight decay.)

  • Do you have a baseline model/algorithm, so that you can tell whether 80% accuracy is actually good or not?
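
As a minimal sketch of the optimizer and regularization points above: the network shape, dropout rate, and weight-decay value below are illustrative assumptions, not settings from the question.

```python
import torch
import torch.nn as nn

# A simple feed-forward classifier with dropout between layers (all sizes are illustrative).
model = nn.Sequential(
    nn.Linear(50, 35),
    nn.ReLU(),
    nn.Dropout(p=0.3),        # dropout regularization
    nn.Linear(35, 2),
    nn.LogSoftmax(dim=1),     # pairs with NLLLoss
)

criterion = nn.NLLLoss()

# Adam (or SGD) instead of "Adaboost"; weight_decay adds an L2 penalty to the loss.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
# Alternative: torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)

# One illustrative training step on random data, just to show the wiring.
x = torch.randn(64, 50)                 # batch of 64 samples, 50 features
y = torch.randint(0, 2, (64,))          # binary targets
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```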

150 epochs just for a binary classification task seems like a lot. Why don't you start from an off-the-shelf classifier model? You can find several examples of regression and classification in this tutorial.
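
For instance, a quick baseline with an off-the-shelf classifier (scikit-learn's gradient boosting is my suggestion here, not something from the original question, and the data below is a random placeholder with the question's shapes) would show whether the RNN's 80% is an improvement over a non-sequential model:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder data with the shapes described in the question: 65,000 rows, 50 features.
X = np.random.randn(65_000, 50)
y = np.random.randint(0, 2, size=65_000)    # 1 = rain tomorrow, 0 = no rain

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

baseline = GradientBoostingClassifier().fit(X_train, y_train)
print("baseline accuracy:", accuracy_score(y_test, baseline.predict(X_test)))
```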
