
Predicting and training in different threads with Keras / TensorFlow

I am using Keras and TensorFlow for a kind of online learning: I receive new data periodically and retrain my models with it. I store several models in ".h5" files, so whenever I need to train or predict I load the relevant model and then perform the necessary operations.

Currently I have separated training and prediction into two different threads, so that predictions can be made while the other thread trains. With locks I try to ensure that no prediction and no training are done on the same model at the same time (I think this works), but I am aware that Keras is not well prepared for this. I always get various errors regarding the TensorFlow graph or session, for instance:

Traceback (most recent call last):
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\_compat.py", line 35, in reraise
    raise value
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 859, in predict_times
    0] + '.h5')
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 164, in get_prediction
    model, scaler = self.load_model_file(self.graph_pred, self.session, path)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 114, in load_model_file
    model = load_model(path)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 287, in _deserialize_model
    K.batch_set_value(weight_value_tuples)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\backend\tensorflow_backend.py", line 2470, in batch_set_value
    get_session().run(assign_ops, feed_dict=feed_dict)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\backend\tensorflow_backend.py", line 206, in get_session
    session.run(tf.variables_initializer(uninitialized_vars))
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 2831, in variables_initializer
    return control_flow_ops.group(*[v.initializer for v in var_list], name=name)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3432, in group
    return _GroupControlDeps(dev, deps, name=name)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3384, in _GroupControlDeps
    return no_op(name=name)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\contextlib.py", line 88, in __exit__
    next(self.gen)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 4249, in device
    self._device_function_stack.pop_obj()
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\traceable_stack.py", line 110, in pop_obj
    return self._stack.pop().obj
IndexError: pop from empty list

Or the error:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\threading.py", line 1182, in run
    self.function(*self.args, **self.kwargs)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 632, in train
    self.update_prediction_historics_all()
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 649, in update_prediction_historics_all
    self.update_prediction_historics_dataset(new_dataset, loadModel=True)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 672, in update_prediction_historics_dataset
    0] + ".h5", loadModel=loadModel)[
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 198, in get_predictions_sequential
    model, scaler = self.load_model_file(self.graph_pred, self.session, path)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 114, in load_model_file
    model = load_model(path)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 225, in _deserialize_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 458, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\utils\generic_utils.py", line 145, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\sequential.py", line 301, in from_config
    model.add(layer)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\sequential.py", line 181, in add
    output_tensor = layer(self.outputs[0])
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\base_layer.py", line 431, in __call__
    self.build(unpack_singleton(input_shapes))
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\layers\core.py", line 872, in build
    constraint=self.bias_constraint)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\base_layer.py", line 252, in add_weight
    constraint=constraint)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\backend\tensorflow_backend.py", line 402, in variable
    v = tf.Variable(value, dtype=tf.as_dtype(dtype), name=name)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 183, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 146, in _variable_v1_call
    aggregation=aggregation)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 125, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variable_scope.py", line 2444, in default_variable_creator
    expected_shape=expected_shape, import_scope=import_scope)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 187, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 1329, in __init__
    constraint=constraint)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 1492, in _init_from_args
    ops.add_to_collections(collections, self)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\contextlib.py", line 88, in __exit__
    next(self.gen)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 5347, in init_scope
    yield
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 4369, in __exit__
    self._graph._pop_control_dependencies_controller(self)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 4390, in _pop_control_dependencies_controller
    assert self._control_dependencies_stack[-1] is controller
AssertionError

My solution was to use one graph for prediction and another for training, and every time I want to perform a TF operation I use:

with server_predict.graph_pred.as_default():
    with tf.Session(graph=server_predict.graph_pred) as sess:

And I also added the line:

        backend.set_session(sess)

Despite this, I keep getting errors coming from the TF session or graph, as it seems the operations are not properly separated. Another error is the one I described in this issue, which is still open, regarding the TF session. The suggested solution of calling K.clear_session() (where K is the Keras backend) did not work for me.
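For reference, the recipe that usually works with Keras 2 on TF 1.x is to create the graph and one long-lived session per worker and reuse them for every call, rather than opening a fresh tf.Session inside each with block (a newly opened session has none of the loaded variables initialized, which matches the tracebacks above). A minimal sketch with illustrative names (ModelWorker is not from the original code):

```python
import tensorflow as tf
from keras import backend as K
from keras.models import load_model

class ModelWorker:
    """Sketch (TF 1.x / Keras 2): one graph and ONE long-lived session per worker."""

    def __init__(self, path):
        self.graph = tf.Graph()
        self.session = tf.Session(graph=self.graph)
        with self.graph.as_default(), self.session.as_default():
            K.set_session(self.session)
            self.model = load_model(path)  # variables now live in self.session

    def predict(self, x):
        # Reuse the stored session -- creating a new tf.Session() here would
        # see uninitialized variables and fail like the tracebacks above.
        with self.graph.as_default(), self.session.as_default():
            return self.model.predict(x)
```

Each thread would own one ModelWorker instance, so its graph and session never mix with the other thread's.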

Has anyone had a similar problem, or programmed a similar task, who might be able to help me?

Thanks!!


Found a workaround to make this function. Instead of launching two threads over the same (custom) class, I have two objects of the same class: one dedicated to training and the other to prediction. This is not a real multithreaded app (even though the two objects are launched from the same main). Until I (we) find a proper multithreaded solution, this might help.
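In sketch form, the workaround looks roughly like this (the class and method names are hypothetical, not the actual run_keras_server.py code):

```python
import threading

class ModelServer:
    """Hypothetical sketch of the workaround: two instances of the same class,
    one per role, each owning its own private graph/session state."""

    def __init__(self, role):
        self.role = role  # "train" or "predict"
        # in the real class, a private tf.Graph / tf.Session would be built here

    def run_once(self):
        # placeholder for one training pass or one prediction request
        return self.role

# Both objects are launched from the same main. Since each instance keeps its
# own TF state, the graphs never collide the way shared state did.
trainer = ModelServer("train")
predictor = ModelServer("predict")
threading.Thread(target=trainer.run_once, daemon=True).start()
result = predictor.run_once()
```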

However, I still do not understand how I got the errors before, nor why merely having two objects avoids them, even though these objects run in the same process. Is it that Keras/TensorFlow can only operate on one graph at a time, but defines different graphs for different objects in the same process?

The easiest solution is to have two separate Keras models: the first runs in inference mode and the second in training mode. Every time the inference model receives a new dataset to predict on, it first checks whether it has the most up-to-date .h5 file; if not, it loads the new file first and then runs the prediction. This way you can avoid locks and such.
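The "most up-to-date .h5" check can be done with the file's modification time. A minimal sketch, assuming the trainer saves to a path like "model.h5" (the class and names below are illustrative, not from the original code):

```python
import os
import threading

class InferenceWorker:
    """Hypothetical sketch: reload the .h5 only when the file on disk is newer
    than the copy already in memory."""

    def __init__(self, model_path):
        self.model_path = model_path
        self.model = None
        self.loaded_mtime = None       # mtime of the file we last loaded
        self.lock = threading.Lock()   # guards the reload itself

    def is_stale(self, current_mtime):
        # Pure check: nothing loaded yet, or the file changed since last load.
        return self.loaded_mtime is None or current_mtime > self.loaded_mtime

    def predict(self, x):
        with self.lock:
            mtime = os.path.getmtime(self.model_path)
            if self.is_stale(mtime):
                from keras.models import load_model  # deferred heavy import
                self.model = load_model(self.model_path)
                self.loaded_mtime = mtime
        return self.model.predict(x)
```

Because the trainer only ever writes the file and the predictor only ever reads it, the two sides share no in-memory Keras objects at all.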

It's hard to give advice specific to your case, because what you want is likely not the same as what I needed.

  • This is my opinion, after having done something similar with TensorFlow multiprocessing.

