简体   繁体   中英

Calculating Perplexity and Memory Issues in Keras/Tensorflow

I'd like to evaluate my model with Perplexity after each training epoch. I'm using Keras with Tensorflow backend. The problem is, that after each evaluation more and more memory is used but never released. So after a few epochs my system crashes. It would work without the memory issue if I'm not using keras and tensorflow functions. But then it would be waaay too slow. Here is the code:

def compute_perplexity(self, modelName, sentences):
    all_labels, all_predictions = self.predictLabels_for_perplexity_evaluation(self.models[modelName], sentences)
    # add an axis to fit tensor shape
    for i in range(len(all_labels)):
        all_labels[i] = all_labels[i][:,:, np.newaxis]

#calculate perplexity for each sentence length and each datapoint and append to list
perplexity = []
for i in range(10,15): #range(len(all_labels)):
    start = time.time()
    xentropy = K.sparse_categorical_crossentropy(tf.convert_to_tensor(all_labels[i]), tf.convert_to_tensor(all_predictions[i]))
    perplexity.append(K.eval(K.pow(2.0, xentropy)))
    print('time for one set of sentences. ', time.time()- start)
#average for each datapoint
for i in range(len(perplexity)):
    perplexity[i] = np.average(perplexity[i], axis=1)
    perplexity[i] = np.average(perplexity[i])

return np.mean(perplexity)

There is no need to evaluate this metric using TensorFlow, what you code does is to add the all_labels array to the graph each time it is called, which explains the memory usage you are seeing.

Consider implementing all this computation using numpy, or making an operation that you evaluate with new data in a session using feed_dict (without using tf.convert_to_tensor ).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM