Train a reinforcement learning model with a large number of images
I am trying to train a deep reinforcement learning model on a maze-escaping task. Each time, it takes one image as the input (e.g., a different "maze").
Suppose I have about 10K different maze images. Ideally, after training on N mazes, my model would be able to quickly solve the remaining 10K - N mazes.
I am looking for good ideas or empirical evidence on how to select a good N for the training task.
More generally, how should I estimate and improve the "transfer learning" ability of my reinforcement learning model, i.e., make it generalize better?
Any advice or suggestions would be much appreciated. Thanks.
First, I strongly recommend representing the maze maps as 2D arrays instead of images. This would do your model a huge favor, because the array is already the extracted feature, so the model does not have to learn to pick walls and corridors out of raw pixels. Try using 2D arrays in which walls are marked by ones on a background of zeros.
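As a minimal sketch of that representation, assuming each maze is available as text in which '#' marks a wall and '.' marks open floor (this layout and the name maze_to_grid are made up for illustration, not something from the question), the grid could be built like this:

```python
# Hypothetical text layout: '#' is a wall, '.' is open floor.
MAZE_TEXT = """\
#####
#..##
#.#.#
#...#
#####"""


def maze_to_grid(text):
    """Return a list of rows, with 1 for wall cells and 0 for free cells."""
    return [[1 if ch == "#" else 0 for ch in line] for line in text.splitlines()]


grid = maze_to_grid(MAZE_TEXT)
for row in grid:
    print(row)
```

If the mazes exist only as images, the same grid could presumably be obtained by thresholding one pixel per cell, but that depends on how the images were drawn.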
As for finding the best N:
Your model architecture matters far more than the fraction of the data used for training or the batch size. It is better to design a good model first and then find the best N by testing different values of N. Because N is a single variable, this search is easy to do yourself.
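One way to run that search is sketched below, under some assumptions: a fixed held-out set of mazes is kept aside, the model is trained on N mazes for each candidate N, and every model is scored on the same held-out set. The functions dummy_train and dummy_eval are placeholders that only make the sketch runnable (the scores they produce are synthetic); they would be replaced by your actual RL training loop and a real metric such as the success rate on unseen mazes.

```python
import random


def sweep_training_sizes(maze_ids, candidate_ns, train_fn, eval_fn,
                         holdout_size=1000, seed=0):
    """Train on N mazes for each candidate N and score on a fixed held-out set."""
    ids = list(maze_ids)
    random.Random(seed).shuffle(ids)
    # The held-out mazes are never used for training, for any N.
    holdout, pool = ids[:holdout_size], ids[holdout_size:]
    results = {}
    for n in candidate_ns:
        if n > len(pool):
            raise ValueError(f"N={n} exceeds the {len(pool)} mazes left for training")
        model = train_fn(pool[:n])
        results[n] = eval_fn(model, holdout)
    return results


# Placeholder: stands in for real RL training; it only records how many mazes it saw.
def dummy_train(train_ids):
    return {"n_seen": len(train_ids)}


# Placeholder: a synthetic saturating curve, NOT a real measurement.
def dummy_eval(model, test_ids):
    return model["n_seen"] / (model["n_seen"] + 500)


scores = sweep_training_sizes(range(10_000), [100, 500, 2000, 9000],
                              dummy_train, dummy_eval)
for n, score in scores.items():
    print(n, round(score, 3))
```

Scoring every N on the same held-out mazes keeps the results comparable. A reasonable N is then the point where the held-out score stops improving noticeably.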