
About enqueue order in TensorFlow

Original post: 2017-12-23 15:36:05 | Tags: python, multithreading, tensorflow

I am relatively new to Python and TensorFlow, and while learning about threading and queues in TensorFlow I got a little confused.

When multiple threads (created by a QueueRunner) enqueue data from a data source into a single queue, what will the enqueue order be? Will the data in the queue keep the original order of the data source? If so, how is that achieved with multiple threads? If not, why do we need RandomShuffleQueue? (If the enqueue order is not fixed, shuffling the dequeue order seems a little redundant.)

Thank you in advance.

1 Answer

First of all, you should no longer use queues and queue runners for input pipelines. Use the Datasets API (tf.data) instead. You should find it more intuitive and much simpler.

To answer your questions:

Re: When multiple threads (created by a QueueRunner) enqueue data from a data source into a single queue, what will the enqueue order be?

The order is not defined. It depends on thread scheduling.

Re: Will the data in the queue keep the original order of the data source?

The data source is usually a set of files, and a given file is usually processed by a single thread and a single reader. If that is the case, the order of examples coming from one file will be preserved (unless you purposefully shuffle them, e.g. by using RandomShuffleQueue instead of FIFOQueue).
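
The behaviour described above can be imitated without TensorFlow. The following is only a minimal sketch using Python's standard `queue` and `threading` modules as stand-ins for FIFOQueue and the QueueRunner threads; the in-memory `FILES` dictionary and the helper names (`read_file`, `run_pipeline`, `per_file_order`) are made up for illustration. Each thread enqueues the examples of exactly one "file", so the interleaving across files depends on thread scheduling, while the order within each file is kept.

```python
import queue
import threading

# Hypothetical in-memory stand-ins for three input files.
FILES = {
    "a": ["a0", "a1", "a2", "a3"],
    "b": ["b0", "b1", "b2", "b3"],
    "c": ["c0", "c1", "c2", "c3"],
}


def read_file(name, examples, shared_queue):
    """One thread reads one file and enqueues its examples in file order."""
    for example in examples:
        shared_queue.put((name, example))


def run_pipeline():
    """Start one enqueuing thread per file and return the dequeue order."""
    shared_queue = queue.Queue()  # first-in, first-out, like a FIFOQueue
    threads = [
        threading.Thread(target=read_file, args=(name, examples, shared_queue))
        for name, examples in FILES.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All producers have finished, so the queue can be drained safely.
    dequeued = []
    while not shared_queue.empty():
        dequeued.append(shared_queue.get())
    return dequeued


def per_file_order(dequeued):
    """Group the dequeued examples by source file, keeping dequeue order."""
    by_file = {}
    for name, example in dequeued:
        by_file.setdefault(name, []).append(example)
    return by_file


if __name__ == "__main__":
    items = run_pipeline()
    # The interleaving across files may differ from run to run.
    print([example for _, example in items])
    # The order inside each file is always the original one.
    print(per_file_order(items) == FILES)
```

With such small inputs the threads often finish one after another, so the output may look ordered by file. That is an accident of scheduling and not something the queue guarantees.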

Re: Why do we need RandomShuffleQueue?

In the common scenario above, if you don't shuffle examples, their order within each file will be preserved. You might want to shuffle your examples for training because they might follow some order within a file. Also, while the order across files is not deterministic, it is far from uniformly random. You might want to achieve (a good approximation to) a uniformly random order across your entire data set.
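
To see why shuffling through a bounded buffer gives only an approximation to a uniformly random order, here is a simplified model in plain Python. It is not TensorFlow's implementation of RandomShuffleQueue; the function name `shuffle_buffer` and its `capacity` parameter are assumptions made for this sketch. The buffer is filled first, and after that each incoming example replaces a randomly chosen buffered example, which is emitted.

```python
import random


def shuffle_buffer(stream, capacity, rng):
    """Yield the items of `stream` in a locally shuffled order.

    Simplified model: hold up to `capacity` items; once the buffer is full,
    every new item evicts (and emits) one randomly chosen buffered item.
    """
    buffer = []
    for item in stream:
        if len(buffer) < capacity:
            buffer.append(item)  # still filling the buffer
            continue
        index = rng.randrange(capacity)
        yield buffer[index]      # emit a random buffered item
        buffer[index] = item     # and put the new item in its slot
    # The input is exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


if __name__ == "__main__":
    # Examples ordered by file, as they might arrive without shuffling.
    ordered = [f"file{f}_ex{i}" for f in range(3) for i in range(4)]
    rng = random.Random(0)  # seeded only to make the demo repeatable
    print(list(shuffle_buffer(ordered, capacity=4, rng=rng)))
```

In this model an example can move at most `capacity - 1` positions earlier than its original position, so a small buffer only mixes neighbouring examples. The larger the buffer is relative to the data set, the closer the result gets to a uniformly random order.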