Preload whole dataset on GPU for training Keras model
I have a specific case where the networks are relatively tiny, and for the sake of convergence and generalization I should keep batch sizes small (e.g. 256), which leads to hundreds of batches to process per epoch.
Unfortunately, in this scenario batch loading and loss calculation become a bottleneck (as the timeline tool tells me).
In TensorFlow, you can write something like this to load the data on the GPU:
import tensorflow as tf
with tf.device('/gpu:0'):
    train_data = tf.constant(train_data_numpy)
But if I pass train_data to the Keras Model.predict or Model.fit functions, I get the following error:
keras/engine/training.pyc in predict(self, x, batch_size, verbose)
1515 f = self.predict_function
1516 return self._predict_loop(f, ins,
-> 1517 batch_size=batch_size, verbose=verbose)
1518
1519 def train_on_batch(self, x, y,
keras/engine/training.pyc in _predict_loop(self, f, ins, batch_size, verbose)
1129 if verbose == 1:
1130 progbar = Progbar(target=samples)
-> 1131 batches = _make_batches(samples, batch_size)
1132 index_array = np.arange(samples)
1133 for batch_index, (batch_start, batch_end) in enumerate(batches):
keras/engine/training.pyc in _make_batches(size, batch_size)
368 A list of tuples of array indices.
369 """
--> 370 num_batches = int(np.ceil(size / float(batch_size)))
371 return [(i * batch_size, min(size, (i + 1) * batch_size))
372 for i in range(0, num_batches)]
AttributeError: 'Dimension' object has no attribute 'ceil'
Which makes sense, since Keras expects only NumPy-like arrays and lists of such arrays.
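To illustrate why the first error appears, here is a minimal sketch of the batching arithmetic visible in the traceback. It assumes plain integers for the sample count. `make_batches` is a hypothetical stand-in that mirrors Keras's private `_make_batches`, and `math.ceil` is swapped in for `np.ceil` to keep the snippet dependency-free. With a tensor input, the sample count is presumably a TensorFlow `Dimension` object rather than an int, which `np.ceil` cannot handle.

```python
import math


def make_batches(size, batch_size):
    """Return (start, end) index pairs covering `size` samples.

    Hypothetical re-creation of the helper shown in the traceback;
    it only works when `size` is a plain number.
    """
    num_batches = int(math.ceil(size / float(batch_size)))
    return [(i * batch_size, min(size, (i + 1) * batch_size))
            for i in range(num_batches)]


# 1000 samples with a batch size of 256 give three full batches and one partial.
batches = make_batches(1000, 256)
print(batches)  # [(0, 256), (256, 512), (512, 768), (768, 1000)]
```

The same arithmetic explains the "hundreds of batches per epoch" mentioned above: the batch count grows linearly with the dataset size once the batch size is fixed.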
Having said that, I also tried PyCUDA and CuPy arrays, since they claim to be NumPy-like, but those produce the following error:
keras/engine/training.pyc in predict(self, x, batch_size, verbose)
1515 f = self.predict_function
1516 return self._predict_loop(f, ins,
-> 1517 batch_size=batch_size, verbose=verbose)
1518
1519 def train_on_batch(self, x, y,
keras/engine/training.pyc in _predict_loop(self, f, ins, batch_size, verbose)
1139 ins_batch = _slice_arrays(ins, batch_ids)
1140
-> 1141 batch_outs = f(ins_batch)
1142 if not isinstance(batch_outs, list):
1143 batch_outs = [batch_outs]
keras/backend/tensorflow_backend.pyc in __call__(self, inputs)
2266 updated = session.run(self.outputs + [self.updates_op],
2267 feed_dict=feed_dict,
-> 2268 **self.session_kwargs)
2269 return updated[:len(self.outputs)]
2270
tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
893 try:
894 result = self._run(None, fetches, feed_dict, options_ptr,
--> 895 run_metadata_ptr)
896 if run_metadata:
897 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
1091 feed_handles[subfeed_t] = subfeed_val
1092 else:
-> 1093 np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
1094
1095 if (not is_tensor_handle_feed and
numpy/core/numeric.pyc in asarray(a, dtype, order)
529
530 """
--> 531 return array(a, dtype, copy=False, order=order)
532
533
ValueError: object __array__ method not producing an array
I tried googling this issue, but the only reasonable match is a Chinese blog post that basically suggests patching Keras, which is obviously impractical.
I wonder what the correct way is to preload the whole dataset on the GPU for Keras.
Useful info: I am using Keras 2.0.6 with TF 1.3. Upgrading to the 2.0.8/1.4 stack is not yet possible due to crucial API changes, but I would definitely expedite the upgrade if it solves this issue.
You don't have to load the whole dataset. You can ingest the data piece by piece using TensorFlow's Dataset class.
TensorFlow can take care of loading more data while your GPU is crunching your numbers. You can follow the steps below.
You can check the example listed here.
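As a rough illustration of the idea, here is a minimal sketch of feeding a Keras model from a Dataset pipeline that prepares the next batch while the current one is being processed. It assumes a recent TensorFlow 2.x release (`tf.data`, `tf.data.AUTOTUNE`), not the TF 1.3 / Keras 2.0.6 stack from the question. The random arrays, the tiny model, and the choice of `from_tensor_slices` are all placeholders of mine, not the example the answer links to. Note that this overlaps input preparation with computation rather than pinning the whole dataset in GPU memory.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data; replace with the real training arrays.
features = np.random.rand(1024, 8).astype("float32")
labels = np.random.randint(0, 2, size=(1024, 1)).astype("float32")

BATCH_SIZE = 256

# Slice the in-memory arrays into batches and let TensorFlow prepare the
# next batch in the background while the current one is being consumed.
dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(buffer_size=len(features))
    .batch(BATCH_SIZE)
    .prefetch(tf.data.AUTOTUNE)
)

# Tiny placeholder network, in the spirit of the "relatively tiny" models above.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# When a Dataset is passed, batching comes from the pipeline, so no
# batch_size argument is given to fit().
history = model.fit(dataset, epochs=1, verbose=0)
```

Whether this removes the bottleneck depends on where the time is actually spent, so it is worth re-checking with the timeline tool after switching.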
Hope this is helpful.