
Keras model (RNN) predicting strange output on Floydhub

Original post: 2018-07-21 14:41:05. Tags: python, keras

I am very confused about something. As far as I can tell, something is very wrong with my model only when I save it through Floydhub. I'm not even sure the issue is with Floydhub, but I don't understand what could be happening, so I'm blaming it on them for now.

I am trying to run an RNN using Keras that generates text at the word level.

I'm using this dummy dataset (the data has a few issues, but for the purpose of this error report it should work). If you look at the dataset, you'll see it has a fair number of <newline> words.

When I train on my laptop and save the model, I get predictions like this:

<newline> <newline> jerry you the looks <newline> <newline> <newline> jerry you know <newline> <newline> bye the <newline> <newline> <newline> jerry flip a the the to <newline> <newline> jerry elaine you <newline> <newline> jerry elaine it <newline> <newline> jerry <newline> <newline> <newline> elaine have <newline> <newline> you jerry <newline> <newline> jerry her in <newline> <newline> <newline> i just <newline> <newline> <newline> <newline> <newline> he cometh back <newline> <newline> the you <newline> <newline> <newline> <newline> <newline> no of <newline> <newline> elaine <newline> <newline> <newline> <newline> elaine the me <newline> <newline> <newline>

However, when I train via Floydhub (using exactly the same code, only changing paths) and save the model, I get stuff like this:

strengths dont turns hu hosting sittin avoided yayou sittin them tie sittin hu tie turns turns he biography them hereand its battery car afternoon tie into into tie sittin thanks alone turns turns brilliant minute quones shhhhh folks its car turns turns brilliant minute location decided turns turns brilliant biography them sometimes sitting thanks thanks thanks closes turns turns jer grape thursday jerrys jerrys national biography comin turns turns brilliant grape hu drawn minute paper hu probably hu mashed again turns turns jer grape office larry jerrys shop coin lie hescrazylook turns turns jer grape hu decided surprised ive meatloaf

Not a <newline> in sight, just a random selection of words from the vocabulary, with no clear pattern that I can see whatsoever. Surely something is wrong with the weights being saved on Floydhub.

If you go to my repository and compare the output between training and prediction, it should be easy to see the difference.

In the readme you'll see some instructions on how to train or run predictions. Again, everything works if I train and save on my home PC, so I honestly can't figure out what's going on with Floydhub. I'm not even sure the problem is with Floydhub, so if anyone has any ideas about what could be causing this issue, or how I could debug it more effectively, please let me know.

My code can be seen here (rnn.py).

I'm completely stumped.

Thanks

1 answer

Turns out that iterating over a set() doesn't give a stable order. To fix the issue I used sorted() on my set of words before the mapping and tokenising started.
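A minimal sketch of that fix, assuming the vocabulary is built from a flat list of tokens. The function name `build_vocab` and the sample tokens are hypothetical and not taken from rnn.py:

```python
def build_vocab(tokens):
    """Map each distinct word to an integer id in a reproducible order."""
    # sorted() turns the unordered set into a list with a deterministic order,
    # so the same corpus always yields the same word -> id mapping.
    words = sorted(set(tokens))
    word_to_id = {word: i for i, word in enumerate(words)}
    id_to_word = {i: word for word, i in word_to_id.items()}
    return word_to_id, id_to_word


# Hypothetical sample corpus, for illustration only.
tokens = "jerry you know <newline> elaine you the <newline>".split()
word_to_id, id_to_word = build_vocab(tokens)

# Round trip: encode the tokens to ids and decode them back.
encoded = [word_to_id[w] for w in tokens]
decoded = [id_to_word[i] for i in encoded]
print(word_to_id)
```

Because the ids depend only on the sorted words, a model trained in one process and loaded in another sees the same mapping, as long as both are built from the same corpus.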

The reason it was working on my laptop and not on Floydhub is that set() kept the same order for as long as I stayed inside the same environment, whereas on Floydhub every run happens in a different environment. Without the sort, the word-to-index mapping rebuilt at prediction time presumably no longer matched the one used during training, which would explain the random-looking words.
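One likely mechanism, which the answer itself does not confirm, is Python 3's per-process string hash randomization: unless the `PYTHONHASHSEED` environment variable is fixed, each new interpreter hashes strings differently, so a set of words can iterate in a different order from run to run. The sketch below is a rough way to check this, under that assumption. It starts fresh interpreters with different seeds and compares the raw set order with the sorted order. The word list and the helper `set_order` are made up for illustration:

```python
import os
import subprocess
import sys

# Hypothetical vocabulary; every word is distinct.
WORDS = ["jerry", "elaine", "you", "the", "know", "bye", "flip", "looks"]


def set_order(seed, sort=False):
    """Return the iteration order of the word set in a fresh interpreter."""
    expr = "sorted(set(words))" if sort else "set(words)"
    code = f"words = {WORDS!r}; print(' '.join({expr}))"
    # PYTHONHASHSEED fixes the string hash seed for the child process.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


raw_a = set_order(1)
raw_b = set_order(2)
sorted_a = set_order(1, sort=True)
sorted_b = set_order(2, sort=True)

print("seed 1:", raw_a)
print("seed 2:", raw_b)  # may differ from seed 1
print("sorted orders match:", sorted_a == sorted_b)
```

The raw orders may or may not differ for any given pair of seeds, but the sorted order is the same in every process, which is why sorting before building the mapping removes the dependence on the environment.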