Keras neural network predicts the same number for all inputs

Question

I am trying to create a Keras neural network to predict the road distance between two points in a city. I use Google Maps to get the travel distance and then train a neural network to predict it.

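import pandas as pd

arr = []
for i in range(0, 100):
    # generateTwoPoints is my own helper (not shown); it returns one row
    # holding the six values named in the columns below
    arr.append(generateTwoPoints(55.901819, 37.344735, 55.589537, 37.832254))
# Build the DataFrame once, after the loop, rather than on every iteration
df = pd.DataFrame(arr, columns=['p1Lat', 'p1Lon', 'p2Lat', 'p2Lon', 'distnaceInMeters', 'timeInSeconds'])
print(df)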

Neural network architecture:

from keras.optimizers import SGD
sgd = SGD(lr=0.00000001)
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential()
model.add(Dense(100, input_dim=4 , activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1,activation='sigmoid'))
model.compile(loss='mse', optimizer='sgd', metrics=['mse'])

Then I split the data into train and test sets:

Xtrain=train[['p1Lat','p1Lon','p2Lat','p2Lon']]/100
Ytrain=train[['distnaceInMeters']]/100000
Xtest=test[['p1Lat','p1Lon','p2Lat','p2Lon']]/100
Ytest=test[['distnaceInMeters']]/100000

Then I fit the model to the data, but the loss stays the same:

history = model.fit(Xtrain, Ytrain,
                    batch_size=1,
                    epochs=1000,
                    # We pass some validation for
                    # monitoring validation loss and metrics
                    # at the end of each epoch
                    validation_data=(Xtest, Ytest))

I later print the predictions next to the true values:

prediction = model.predict(Xtest)
print(prediction)
print (Ytest)

But the result is practically the same for all inputs:

[[0.26150784]
 [0.26171574]
 [0.2617755 ]
 [0.2615582 ]
 [0.26173398]
 [0.26166356]
 [0.26185763]
 [0.26188275]
 [0.2614446 ]
 [0.2616575 ]
 [0.26175532]
 [0.2615183 ]
 [0.2618127 ]]
    distnaceInMeters
2            0.13595
6            0.27998
7            0.48849
16           0.36553
21           0.37910
22           0.40176
33           0.09173
39           0.24542
53           0.04216
55           0.38212
62           0.39972
64           0.29153
87           0.08788

I cannot find the problem. What is it? I am new to machine learning.

Answer 1

You are making a very elementary mistake: since you are in a regression setting, you should not use a sigmoid activation for your final layer (that is used for binary classification). Change your last layer to

model.add(Dense(1,activation='linear'))

or even

model.add(Dense(1))

since, according to the docs, if you do not specify the activation argument, it defaults to linear.

Various other advice already offered in the other answer and in the comments may be useful (lower LR, more layers, other optimizers, e.g. Adam), and you certainly need to increase your batch size; but nothing will work with the sigmoid activation function you currently use for your last layer.
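To make the point about the output activation concrete, here is a small sketch in plain Python (no Keras needed). It only illustrates that a sigmoid output is confined to the interval (0, 1) while a linear output is not. The pre-activation values and the 150 km target are made-up numbers for illustration, not values from the question.

```python
import math

def sigmoid(z):
    """Logistic function: squashes any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def linear(z):
    """Identity activation: the output is unbounded, as regression needs."""
    return z

# Pre-activation values a final Dense(1) layer might produce (made-up numbers).
pre_activations = [-10.0, -1.0, 0.0, 1.0, 10.0]
sigmoid_outputs = [sigmoid(z) for z in pre_activations]

# Hypothetical scaled target: 150 km / 100000 = 1.5, which lies outside (0, 1).
target = 1.5
sigmoid_gap = target - sigmoid(50.0)  # even a huge pre-activation gives at most 1
linear_gap = target - linear(1.5)     # a linear unit can hit the target exactly

for z, s in zip(pre_activations, sigmoid_outputs):
    print(f"z={z:6.1f}  sigmoid={s:.5f}  linear={linear(z):6.1f}")
print(f"smallest possible sigmoid error for target {target}: {sigmoid_gap:.2f}")
print(f"linear error for target {target}: {linear_gap:.2f}")
```

With a sigmoid output, any target above 1 can never be reached no matter how long you train, whereas a linear output has no such ceiling.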

Unrelated to the issue, but in regression settings you don't need to repeat your loss function as a metric; this

model.compile(loss='mse', optimizer='sgd')

will suffice.
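Putting these pieces together, a minimal sketch of the corrected model might look like the following. It assumes the tensorflow.keras import path (adjust it if you use standalone Keras) and picks the Adam optimizer from the suggestions mentioned above. The function name build_regression_model is made up for illustration.

```python
from tensorflow import keras

def build_regression_model(n_features=4):
    """Two hidden ReLU layers followed by a single linear output unit."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(100, activation="relu"),
        keras.layers.Dense(100, activation="relu"),
        keras.layers.Dense(1),  # no activation argument, so the output is linear
    ])
    # The loss is reported during training anyway, so no extra 'mse' metric.
    model.compile(loss="mse", optimizer="adam")
    return model

model = build_regression_model()
model.summary()
```

You would then call model.fit as in the question, but with a larger batch size such as batch_size=32.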

Answer 2

It would be very useful if you could post the progression of the loss and MSE (on both the training and the validation/test set) throughout training. Even better, visualize it as described at https://machinelearningmastery.com/display-deep-learning-model-training-history-in-keras/ and post the visualization here.

In the meantime, based on the facts:

1) You say the loss isn't decreasing (I'm assuming on the training set, during training, based on your compile args).
2) You say that the prediction "accuracy" on your test set is bad.
3) My experience/intuition (not an empirical assessment) tells me that your two-layer dense model is a little too small to capture the complexity inherent in your data. In other words, your model suffers from too high a bias: https://towardsdatascience.com/understanding-the-bias-variance-tradeoff-165e6942b229

The fastest and easiest thing you can try is to add more layers and more nodes to each layer.

However, I should note that many factors affect driving distance and driving time beyond the straight-line distance between the two coordinates, which is probably the feature your neural network will extract most readily. For example: whether you drive on a highway or on side streets, traffic lights, whether the roads twist and turn or go straight. To infer all of that from coordinates alone you will, in my opinion, need an enormous number of examples. If you could add input columns such as the distance to the nearest highway from each point, you might be able to train with less data.
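As a sketch of what adding an input column could look like, the snippet below computes the straight-line (great-circle) distance between the two points with the haversine formula and appends it as a fifth feature. Note that this is not the highway-distance feature mentioned above, which would need map data; it is a simpler stand-in that requires only the coordinates. The function names and the spherical-Earth radius are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def add_straight_line_feature(row):
    """Return the four coordinates plus the straight-line distance as a fifth feature."""
    p1_lat, p1_lon, p2_lat, p2_lon = row
    return [p1_lat, p1_lon, p2_lat, p2_lon,
            haversine_m(p1_lat, p1_lon, p2_lat, p2_lon)]

# The two corner points passed to generateTwoPoints in the question.
corners = (55.901819, 37.344735, 55.589537, 37.832254)
features = add_straight_line_feature(corners)
print(f"straight-line distance between the corners: {features[-1] / 1000:.1f} km")
```

With such a column the network no longer has to learn the geometry itself and only has to model how much longer the road route is than the straight line. The model's input size would have to change from 4 to 5 accordingly.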

I would also recommend that you double-check that you are feeding as input what you think you are feeding (including its shape). You should also standardize your inputs, for example with the utilities in sklearn, which might help the model learn faster and converge to a higher "accuracy".
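To show what standardization does to inputs like these, here is a small sketch in plain Python. It performs the same per-column operation as scikit-learn's StandardScaler (subtract the mean, divide by the population standard deviation). The sample coordinates are made up, and the helper names fit_standardizer and standardize are hypothetical.

```python
import statistics

def fit_standardizer(rows):
    """Compute per-column mean and population standard deviation from training rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return means, stds

def standardize(rows, means, stds):
    """Apply (x - mean) / std column by column."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Made-up training rows: p1Lat, p1Lon, p2Lat, p2Lon within one city.
train_rows = [
    [55.90, 37.35, 55.60, 37.80],
    [55.75, 37.60, 55.85, 37.40],
    [55.62, 37.78, 55.70, 37.55],
    [55.80, 37.45, 55.65, 37.70],
]
means, stds = fit_standardizer(train_rows)
scaled = standardize(train_rows, means, stds)

# Spread of the first column under a plain /100 scaling vs. standardization.
spread_div100 = statistics.pstdev([row[0] / 100 for row in train_rows])
spread_scaled = statistics.pstdev([row[0] for row in scaled])
print(f"std of p1Lat after /100:            {spread_div100:.5f}")
print(f"std of p1Lat after standardization: {spread_scaled:.5f}")
```

Dividing by 100 leaves every latitude within a tiny band around 0.56, so the rows look almost identical to the network; after standardization each column has zero mean and unit spread. Fit the means and standard deviations on the training set only and reuse them for the test set.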

If and when you post more code, the training history, or the number of training samples, I can help you more.

EDIT 1: Try changing the batch size to a larger number, preferably batch_size=32, if it fits in your memory. You can use a small batch size (such as 1) when working with an "info rich" input like an image, but with a very "info poor" datum like 4 floats (2 coordinates), the gradient of each batch (with batch_size=1) will point in a practically random direction and will not necessarily get any closer to a local minimum. Only when you take the gradient of the collective loss over a larger batch (like 32, and perhaps more) will you get a gradient that points at least approximately in the direction of the local minimum, and converge to a better result. Also, I suggest that you don't tune the learning rate manually, and perhaps switch to an optimizer like "adam" or "RMSProp".
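The noise argument can be illustrated with a toy experiment in plain Python. The sketch below uses a made-up one-dimensional regression problem (not the data from the question) and measures how much the mini-batch gradient estimate scatters around the full-data gradient for batch_size=1 and batch_size=32. All names and constants are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)

# Synthetic 1-D regression data (made up): y = 3x + noise.
xs = [random.uniform(-1.0, 1.0) for _ in range(1000)]
ys = [3.0 * x + random.gauss(0.0, 1.0) for x in xs]

def batch_gradient(w, indices):
    """Gradient of the mean squared error w.r.t. w for the model y_hat = w * x."""
    return statistics.fmean(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in indices)

def gradient_spread(w, batch_size, n_batches=500):
    """Standard deviation of the gradient estimate across random mini-batches."""
    grads = [
        batch_gradient(w, random.sample(range(len(xs)), batch_size))
        for _ in range(n_batches)
    ]
    return statistics.pstdev(grads)

w = 0.0  # current weight, far from the optimum near 3
full_gradient = batch_gradient(w, range(len(xs)))
spread_1 = gradient_spread(w, batch_size=1)
spread_32 = gradient_spread(w, batch_size=32)

print(f"full-data gradient:    {full_gradient:.3f}")
print(f"spread, batch_size=1:  {spread_1:.3f}")
print(f"spread, batch_size=32: {spread_32:.3f}")
```

In this toy setting the single-sample gradients scatter widely around the full-data gradient, while averaging over 32 samples gives a much steadier estimate. Real networks behave less cleanly, but the same averaging effect applies.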

EDIT 2: @Desertnaut made an excellent point that I totally missed, a correction without which your code will not work properly. He deserves the credit, so I will not include it here; please refer to his answer. Also, don't forget to raise your batch size, and don't tune your learning rate manually: an optimizer such as "adam" will adapt it for you.
