I trained quora question pair detection with LSTM but training accuracy is very low and always changes when i train. I dont understand what mistake i did.
I tried changing loss and optimiser and with increased epoch.
import numpy as np
from numpy import array
from keras.callbacks import ModelCheckpoint
import keras
from keras.optimizers import SGD
import tensorflow as tf
from sklearn import preprocessing
import xgboost as xgb
from keras import backend as K
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
from keras.preprocessing.text import Tokenizer , text_to_word_sequence
from keras.preprocessing.sequence import pad_sequences
from keras.layers.embeddings import Embedding
from keras.models import Sequential, model_from_json, load_model
from keras.layers import LSTM, Dense, Input, concatenate, Concatenate, Activation, Flatten
from keras.models import Model
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer,CountVectorizer
import nltk
from nltk.stem.lancaster import LancasterStemmer
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
import pandas as pd
import scipy
import matplotlib.pyplot as plt
import pickle
df = pd.read_csv("questions.csv")
df.drop(['id','qid1', 'qid2'], axis=1, inplace=True)
df2 = pd.read_csv("testmenew.csv")
## TO filter the datset
filename = 'newoutput2.h5'
model.load_weights(filename)
new = model.predict(TestInput)
if new > 0.6:
print("Duplication detected")
else:
print("No duplicate")
new
giving output around 0.6567 but not atall increasing, Please help !!
Train on 323480 samples, validate on 80871 samples Epoch 1/10 323480/323480 [==============================] - 27s 83us/step - loss: 0.6931 - acc: 0.6304 - val_loss: 0.6931 - val_acc: 0.6323 Epoch 2/10 323480/323480 [==============================] - 24s 73us/step - loss: 0.6931 - acc: 0.6304 - val_loss: 0.6931 - val_acc: 0.6323 Epoch 3/10 323480/323480 [==============================] - 23s 71us/step - loss: 0.6931 - acc: 0.6304 - val_loss: 0.6931 - val_acc: 0.6323 Epoch 4/10 323480/323480 [==============================] - 23s 71us/step - loss: 0.6931 - acc: 0.6304 - val_loss: 0.6931 - val_acc: 0.6323 Epoch 5/10 323480/323480 [==============================] - 23s 72us/step - loss: 0.6931 - acc: 0.6304 - val_loss: 0.6931 - val_acc: 0.6323 Epoch 6/10 323480/323480 [==============================] - 23s 71us/step - loss: 0.6931 - acc: 0.6304 - val_loss: 0.6931 - val_acc: 0.6323 Epoch 7/10 323480/323480 [==============================] - 23s 71us/step - loss: 0.6931 - acc: 0.6304 - val_ loss: 0.6931 - val_acc: 0.6323 Epoch 8/10 323480/323480 [==============================] - 25s 76us/step - loss: 0.6931 - acc: 0.6304 - val_loss: 0.6931 - val_acc: 0.6323 Epoch 9/10 323480/323480 [==============================] - 25s 78us/step - loss: 0.6931 - acc: 0.6304 - val_loss: 0.6931 - val_acc: 0.6323 Epoch 10/10 323480/323480 [==============================] - 25s 78us/step - loss: 0.6931 - acc: 0.6304 - val_loss: 0.6931 - val_acc: 0.6323
filename = 'newoutput2.h5' model.load_weights(filename) new = model.predict(TestInput) if new > 0.6: print("Duplication detected") else: print("No duplicate") new giving output around 0.6567 but not atall increasing, Please help !!
I need to Increase accuracy of training
There're couple of options to increase the accuracy:
1) Increase the hidden layers in the LSTM node. and/or 2) add another layer of the LSTM. Only 1 hidden layer may not be sufficient for the training of your data.
After making changes in the model as above, you will probably see the stabilization of the accuracy in some range. Based on that you can adjust the other parameters.
Another note: You will need to enable the embedding layer to convert words to vectors.
There are 4 ways to improve deep learning performance:
Improve Performance With Data:
Improve Performance With Algorithms
Improve Performance With Algorithm Tuning
some ideas on tuning your neural network algorithms in order to get more out of them.
Improve Performance With Ensembles
three general areas of ensembles you may want to consider:
check below link for further information: https://machinelearningmastery.com/improve-deep-learning-performance/
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.