
Training a neural network to play Flappy Bird with a genetic algorithm: why can't it learn?

I have been learning about neural networks and genetic algorithms and, to test my understanding, I have tried to make an AI that learns to play Flappy Bird:

[Screenshot of the program]

I have left it running for at least 10 hours (overnight and longer), but the fittest member still shows no significant improvement over where it started, apart from avoiding the floor and ceiling.

The inputs are the rays (as you can see above) that act as sight lines; the network is fed their lengths along with the bird's vertical velocity. The best bird seems to be essentially ignoring every sight line except the horizontal one, and it jumps when that one becomes very short. The output is a number between 0 and 1; if it is larger than 0.5, the bird jumps.

There are 4 hidden layers with 15 neurons each. The input layer feeds forward to the first hidden layer, the first hidden layer to the second, and so on until the final hidden layer feeds forward to the output. The DNA of a bird is an array of real numbers representing the weights of the neural network. I made another project using the same style of neural network and genetic algorithm, in which ants had to travel to food, and it worked perfectly.
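To make the described setup concrete, here is a minimal sketch of how a flat array of real numbers could be decoded into the 4 x 15 feed-forward network and turned into a jump decision. It is not taken from the linked repository: the sigmoid activation, the per-neuron bias, the weight range of -1 to 1, the input count of 8 (for example 7 rays plus velocity), and every function name are assumptions made for illustration.

```python
import math
import random


def layer_sizes(n_inputs, hidden_layers=4, hidden_size=15, n_outputs=1):
    """Layer widths for the described network: inputs, 4 x 15 hidden, 1 output."""
    return [n_inputs] + [hidden_size] * hidden_layers + [n_outputs]


def genome_length(sizes):
    """Number of genes needed: one weight per connection plus one bias per neuron.

    The bias term is an assumption; the question only mentions weights.
    """
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


def random_genome(sizes):
    """A random 'DNA' array with genes drawn uniformly from [-1, 1] (assumed range)."""
    return [random.uniform(-1.0, 1.0) for _ in range(genome_length(sizes))]


def sigmoid(x):
    """Numerically stable logistic function (assumed activation)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def forward(genome, sizes, inputs):
    """Feed the inputs through each layer in turn, reading genes sequentially."""
    activations = list(inputs)
    pos = 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        next_activations = []
        for _ in range(n_out):
            weights = genome[pos:pos + n_in]
            bias = genome[pos + n_in]
            pos += n_in + 1
            total = sum(w * a for w, a in zip(weights, activations)) + bias
            next_activations.append(sigmoid(total))
        activations = next_activations
    return activations


def should_jump(genome, sizes, inputs):
    """The bird jumps when the single output exceeds 0.5, as in the question."""
    return forward(genome, sizes, inputs)[0] > 0.5
```

With 8 inputs this layout needs 871 genes, which gives a sense of how large the search space is for the genetic algorithm.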

Here is the code: https://github.com/Karan0110/flappy-bird-ai

Please say in the comments if you need any additional information.

Please can you say whether my method is flawed? I am almost certain the code itself works correctly (I took it from the previous working project).

I like your idea, but I suggest you change some things.

  • Don't use a network with a fixed structure. Look up NeuroEvolution of Augmenting Topologies (NEAT) and either implement it yourself or use a library like neataptic.

  • I don't believe your network needs that many inputs. I believe 3-5 sensors (with 20-50° gaps) would be enough, since many of the input values seem to be very similar.
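As a rough illustration of the reduced sensor layout suggested above, the sketch below computes ray angles for a small fan of sensors. Centring the fan on the horizontal sight line is an assumption, and `sensor_angles` is a hypothetical helper, not part of the original project.

```python
def sensor_angles(count, gap_degrees):
    """Angles (in degrees) of `count` rays spaced `gap_degrees` apart.

    Assumption: the fan is symmetric around the horizontal ray at 0 degrees.
    """
    half_span = (count - 1) * gap_degrees / 2.0
    return [i * gap_degrees - half_span for i in range(count)]


# Five rays with 30 degree gaps: [-60.0, -30.0, 0.0, 30.0, 60.0]
print(sensor_angles(5, 30))
```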

If you are not sure exactly why your project is not working, try this:

  • Try viewing an image of your current best network. If the network doesn't take important sensors (like the velocity) into account, you'll see it instantly.

  • Make sure all of your sensors are working fine (they look fine in the image above) and be sure to normalize the values in a meaningful way.

  • Check whether the maximum and average scores increase over time. If they don't, either your GA isn't working properly or your network receives inputs that are not good enough to solve the problem.
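Two of the checks above can be sketched in a few lines: scaling the raw sensor values into a fixed range, and recording the maximum and average score of each generation. The scaling scheme (ray lengths divided by a maximum ray length, velocity divided by a maximum speed), the parameter names, and the function names are all assumptions for illustration.

```python
def normalize_inputs(ray_lengths, vertical_velocity, max_ray_length, max_speed):
    """Scale ray lengths to [0, 1] and vertical velocity to [-1, 1].

    max_ray_length and max_speed are assumed constants of the game;
    values outside the expected range are clamped.
    """
    rays = [min(max(length / max_ray_length, 0.0), 1.0) for length in ray_lengths]
    velocity = min(max(vertical_velocity / max_speed, -1.0), 1.0)
    return rays + [velocity]


def generation_stats(scores):
    """Return (maximum, average) score for one generation."""
    return max(scores), sum(scores) / len(scores)


# Log one (max, average) pair per generation and inspect the trend over time.
history = []
for scores in ([1, 2, 3, 6], [2, 4, 4, 10]):  # made-up scores for two generations
    history.append(generation_stats(scores))
print(history)
```

If neither number in `history` rises over many generations, that points at selection, crossover, or mutation rather than at the network itself.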

One trick that helped me a lot is to keep the elite of the GA in a separate array. Only replace elite networks if some other network has performed better than the elite. Keep the elite through all the generations, so once your algorithm finds an extraordinarily good solution, it won't be lost in a later generation unless something else performs better.
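A minimal sketch of that elite array, assuming each individual is stored as a `(fitness, genome)` pair; the function name and the data layout are hypothetical. An elite entry is only displaced when a newcomer scores strictly higher, because Python's sort is stable and the existing elite are listed first.

```python
def update_elite(elite, population, elite_size):
    """Return the new elite after evaluating one generation.

    elite, population: lists of (fitness, genome) pairs.
    On equal fitness the existing elite member is kept.
    """
    candidates = elite + population
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return candidates[:elite_size]


elite = [(10, "a"), (5, "b")]
population = [(7, "c"), (5, "d"), (3, "e")]
elite = update_elite(elite, population, elite_size=2)
print(elite)  # [(10, 'a'), (7, 'c')]
```

One caveat with this approach: if the pipe layout is random, a stored elite score may partly reflect a lucky run, so re-evaluating the elite occasionally may be worthwhile.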

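Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.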

 