
Need help using Keras' model.predict

Asked 2018-05-13 19:28:44 · Tags: python, tensorflow, machine-learning, keras, predict

My goal is to fit a simple neural network by providing 2 vertices of a graph plus a label: 1 if there's a link between them, 0 if there's none.

I fit my model; it gets a loss of about 0.40 and an accuracy of about 83% during fitting. I then evaluate the model by providing a batch of all positive samples and several batches of negative ones (using random.sample). My model gets a loss of ~0.35 and 1.0 accuracy for positive samples, and a loss of ~0.46 and 0.68 accuracy for negative ones.

My understanding of neural networks is extremely limited, but to my understanding the above means that, in theory, the model is always right when it outputs 0 (no link), but it can sometimes output 1 even if there is no link.

Now for my actual problem: I try to "reconstruct" the original graph with my neural network via model.predict. The problem is I don't understand what the predict output means. At first I assumed values above 0.5 mean 1, else 0. But if that's the case, the model doesn't even come close to rebuilding the original.

I get that it won't be perfect, but it simply returns values above 0.5 for random link candidates.

Can someone explain to me how exactly model.predict works and how to properly use it to rebuild my graph?
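For reference, the 0.5 rule described above can be sketched as follows. This assumes the network ends in a single sigmoid unit, which the question does not state; in that case model.predict returns one row per input pair holding one value between 0 and 1. The helper name `predictions_to_links` and the sample numbers are made up for illustration, and `probs` is only a stand-in for the real model.predict output.

```python
def predictions_to_links(pairs, probs, threshold=0.5):
    """Keep the vertex pairs whose predicted link probability exceeds the threshold.

    pairs: list of (u, v) vertex pairs, in the same order they were fed to the model.
    probs: one row per pair, one value per row (the shape a single sigmoid
           output unit would produce; this is an assumption about the model).
    """
    links = []
    for (u, v), row in zip(pairs, probs):
        p = row[0]  # the single sigmoid output for this pair
        if p > threshold:
            links.append((u, v))
    return links


# Hypothetical candidate pairs and made-up scores standing in for model.predict(x).
pairs = [(0, 1), (0, 2), (1, 2)]
probs = [[0.91], [0.52], [0.08]]

default_links = predictions_to_links(pairs, probs)
strict_links = predictions_to_links(pairs, probs, threshold=0.9)
print(default_links)  # [(0, 1), (0, 2)]
print(strict_links)   # [(0, 1)]
```

Raising the threshold trades missed links for fewer false ones, which is the knob that matters when the model leans toward predicting 1.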

1 Answer

The model that you trained is not directly optimized for graph reconstruction. For an N-node graph, you need to predict N choose 2 links, and it is reasonable to assume that the true value of most of these links is 0.

Looking at your model's accuracy on the 0-class and the 1-class, it is clear that your model is biased toward predicting the 1-class, assuming your training data is balanced. Therefore, your reconstructed graph contains many false-alarm links. This is exactly why the quality of your reconstructed graph is poor.
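A rough back-of-the-envelope calculation shows how quickly the false alarms pile up. The per-class accuracies (1.0 and 0.68) come from the question; the graph size of 100 nodes and 200 true links are hypothetical numbers chosen only to make the arithmetic concrete.

```python
import math

# Hypothetical graph size (not from the question).
n_nodes = 100
n_true_links = 200

# Per-class accuracies reported in the question.
pos_accuracy = 1.0
neg_accuracy = 0.68

n_pairs = math.comb(n_nodes, 2)        # every candidate link: N choose 2
n_negative = n_pairs - n_true_links    # pairs that have no link

expected_true_links = round(pos_accuracy * n_true_links)
expected_false_links = round((1 - neg_accuracy) * n_negative)

# Fraction of predicted links that are real.
precision = expected_true_links / (expected_true_links + expected_false_links)

print(n_pairs)               # 4950
print(expected_false_links)  # 1520
print(round(precision, 3))   # 0.116
```

Under these assumed numbers, roughly 1 predicted link in 9 would be real, even though the model finds every true link.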

If it is possible to retrain the model, I suggest you do so and use more negative samples.
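One way to get more negative samples is to draw several non-linked pairs for every linked pair when building the training set. The sketch below is only an illustration: the function name `build_training_pairs`, the ratio of 5 negatives per positive, and the toy graph are all assumptions, and the graph is treated as undirected.

```python
import itertools
import random


def build_training_pairs(nodes, edges, negatives_per_positive=5, seed=0):
    """Return (pairs, labels) with several sampled non-links per true link."""
    # Normalise edges so (u, v) and (v, u) count as the same undirected link.
    edge_set = {tuple(sorted(e)) for e in edges}

    # Every pair of nodes that is not a link is a candidate negative sample.
    non_edges = [
        pair
        for pair in itertools.combinations(sorted(nodes), 2)
        if pair not in edge_set
    ]

    rng = random.Random(seed)
    n_negatives = min(len(non_edges), negatives_per_positive * len(edge_set))
    negatives = rng.sample(non_edges, n_negatives)

    positives = sorted(edge_set)
    pairs = positives + negatives
    labels = [1] * len(positives) + [0] * len(negatives)
    return pairs, labels


# Toy graph: 10 nodes, 3 links (hypothetical data).
toy_nodes = range(10)
toy_edges = [(0, 1), (1, 2), (2, 3)]
train_pairs, train_labels = build_training_pairs(toy_nodes, toy_edges)
print(len(train_pairs), sum(train_labels))  # 18 3
```

The pairs should be shuffled before fitting, and the ratio is something to tune: the closer it is to the real proportion of non-links, the less the model will over-predict links.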

If not, you need to consider applying some post-processing. For example, instead of finding a threshold to decide which two nodes have a link, use the raw predicted link probabilities to form a node-to-node linkage matrix, and apply something like a minimum spanning tree to further decide which links are appropriate.
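A minimal sketch of the spanning-tree idea, under two assumptions that go beyond the answer: each candidate link gets the cost 1 - p (so the minimum spanning tree keeps the most probable links), and the original graph is roughly tree-like, since a spanning tree always has at most N - 1 links and no cycles. The function name `min_spanning_tree_links` and the probabilities are illustrative; the tree is built with Kruskal's algorithm.

```python
def min_spanning_tree_links(n_nodes, link_probs):
    """Kruskal's algorithm on costs 1 - p.

    link_probs: dict mapping (i, j) node pairs to predicted link probability.
    Returns the links kept in the spanning tree, most probable first.
    """
    parent = list(range(n_nodes))  # union-find forest, one set per node

    def find(x):
        # Follow parent pointers to the set representative, halving paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kept = []
    # Cheapest cost first, i.e. highest predicted probability first.
    for (i, j), p in sorted(link_probs.items(), key=lambda kv: 1.0 - kv[1]):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the link joins two separate components
            parent[root_i] = root_j
            kept.append((i, j))
    return kept


# Made-up probabilities: all six exceed 0.5, so plain thresholding keeps every link.
example_probs = {
    (0, 1): 0.90, (1, 2): 0.80, (0, 2): 0.70,
    (2, 3): 0.60, (0, 3): 0.55, (1, 3): 0.52,
}
tree_links = min_spanning_tree_links(4, example_probs)
print(tree_links)  # [(0, 1), (1, 2), (2, 3)]
```

If the real graph contains cycles, a spanning tree will drop true links, so keeping the top-k most probable links per node may be a better fit; that alternative is a suggestion of this sketch, not part of the answer.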