
Setup dense layers to learn from 1D arrays

I have about 100k arrays of size 256 that I would like to feed into a neural network made of a few dense layers, which should output 100k arrays, again of size 256. (I would like the network to transform each input array into the corresponding output array.) I cannot manage to set it up correctly.

My X_train and y_train have shape (98304, 256); my X_test and y_test have shape (16384, 256).

My network at the moment is:

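# Imports assumed (standalone Keras); they are not shown in the original post
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

model = Sequential()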
model.add(Dense(1, input_shape=(256,), activation='relu'))
model.add(Dense(1024, activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dense(1024, activation='relu'))
model.add(Dense(256, activation='linear'))

optimizer = Adam()
model.compile(optimizer=optimizer,loss='mean_squared_error',metrics=['accuracy', 'mae'])

The network runs, but it does not give any meaningful result. Training stops after 20 epochs because I use early stopping.

Epoch 00019: val_loss did not improve from -inf
Epoch 20/200
6400/6400 [==============================] - 1s 232us/step - loss: nan - acc: 0.2511 - mean_absolute_error: nan - val_loss: nan - val_acc: 0.2000 - val_mean_absolute_error: nan

And if I try to use it to predict, I only get nan values (there are no nan values in my training set).

Hope someone can help me with this. Thanks in advance.

Edit: To check whether the problem is with the inputs or with the algorithm, I tried creating my inputs and targets using the following code:

import random
import numpy as np

X_train=[]
y_train=[]

for it in range(1000):
    beginning=random.uniform(0,1)
    end=random.uniform(0,1)
    X_train.append([beginning+(end-beginning)*jt/256 for jt in range(256)])
    y_train.append([end+(beginning-end)*jt/256 for jt in range(256)])
X_train=np.array(X_train)
y_train=np.array(y_train)

And I still get:

Epoch 27/200
1000/1000 [==============================] - 0s 236us/step - loss: nan - acc: 0.4970 - mean_absolute_error: nan

Edit2: If I increase the complexity of my network, I manage to get a loss different from nan using the 10k training arrays created with the function above. However, the results are still quite bad, which makes me wonder whether I am setting up the network correctly.

The new network:

model = Sequential()
model.add(Dense(1, input_shape=(256,), activation='relu'))
model.add(Dense(2048, activation='relu'))
model.add(Dense(2048, activation='relu'))
model.add(Dense(2048, activation='relu'))
model.add(Dense(256, activation='linear'))

optimizer = Adam()
model.compile(optimizer=optimizer,loss='mean_squared_error',metrics=['mae'])

model.summary()

And the result once training converges:

Epoch 33/200
10000/10000 [==============================] - 23s 2ms/step - loss: 0.0561 - mean_absolute_error: 0.2001 - val_loss: 0.0561 - val_mean_absolute_error: 0.2001

If I check the output of the network, I always obtain a vector with all points around 0.5 regardless of the input.
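A quick sanity check on those numbers: the synthetic targets are linear ramps between two values drawn uniformly from [0, 1], so their mean at every position is 0.5. The sketch below regenerates targets the same way (vectorized with NumPy instead of the original loop; the names `n_samples`, `baseline_mse`, and `baseline_mae` are made up for illustration) and scores a predictor that always outputs 0.5. It comes out at roughly 0.056 MSE and 0.20 MAE, close to the reported 0.0561 and 0.2001, which suggests (without proving it) that the network has collapsed to predicting the mean of the targets.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_points = 10_000, 256          # illustrative sizes, matching Edit2

# Same construction as the question's loop: y[j] = end + (beginning - end) * j / 256
beginning = rng.uniform(0, 1, size=(n_samples, 1))
end = rng.uniform(0, 1, size=(n_samples, 1))
t = np.arange(n_points) / n_points
y = end + (beginning - end) * t            # shape (n_samples, 256)

# Score a "model" that ignores its input and always predicts 0.5
y_const = np.full_like(y, 0.5)
baseline_mse = np.mean((y - y_const) ** 2)
baseline_mae = np.mean(np.abs(y - y_const))

print(f"constant-0.5 baseline: mse={baseline_mse:.4f}, mae={baseline_mae:.4f}")
```

Any model that is actually using its input should beat this baseline by a clear margin.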

[Image: prediction example]

Also, if I try to predict a single vector using y_pred=model.predict(Xval[3]), I get the error:

ValueError: Error when checking : expected dense_27_input to have shape (256,) but got array with shape (1,)
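This error is consistent with predict treating the first axis of its input as the batch axis: Xval[3] has shape (256,), which is read as 256 samples of one value each rather than one sample of 256 values. A minimal NumPy sketch of ways to keep the batch axis follows; the `Xval` array here is a hypothetical stand-in filled with zeros, not the asker's data.

```python
import numpy as np

# Hypothetical stand-in for the validation set: 16 samples of length 256
Xval = np.zeros((16, 256))

single = Xval[3]                   # shape (256,): the batch axis is gone
batch_a = Xval[3:4]                # shape (1, 256): slicing keeps the batch axis
batch_b = single[np.newaxis, :]    # shape (1, 256): add the axis back explicitly
batch_c = single.reshape(1, -1)    # shape (1, 256): same result via reshape

print(single.shape, batch_a.shape, batch_b.shape, batch_c.shape)
```

With any of the three batched forms, a call such as model.predict(Xval[3:4]) should return an array of shape (1, 256) for a network whose input shape is (256,).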

Your first layer has only 1 output neuron, which seems wrong. It could be messing up your loss function. Try replacing model.add(Dense(1, input_shape=(256,), activation='relu')) with model.add(InputLayer(input_shape=(256,))).
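To illustrate why a single-unit first layer is a problem, here is a small NumPy sketch of the forward pass through a one-unit ReLU layer followed by a 256-unit linear layer. The weights are random placeholders, not the asker's trained weights, and the later hidden layers are left out for brevity. Everything after the first layer can depend on only one number per sample, and if that ReLU unit goes inactive the output becomes the same vector for every input, which would match the constant output described in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

# Placeholder weights standing in for Dense(1, relu) followed by Dense(256, linear)
w1 = rng.normal(size=(256, 1))
b1 = np.zeros(1)
w2 = rng.normal(size=(1, 256))
b2 = rng.normal(size=256)

x = rng.uniform(0, 1, size=(5, 256))   # five different input arrays

h = relu(x @ w1 + b1)                  # shape (5, 1): one scalar per sample
out = h @ w2 + b2                      # shape (5, 256)

# The input-dependent part of the output is a scalar times one fixed vector,
# so it has rank at most 1 no matter how different the inputs are.
rank = np.linalg.matrix_rank(h @ w2)

# If the single unit is "dead" (pre-activation always negative), the output
# is just the bias of the last layer for every input.
dead_out = relu(x @ w1 - 1e3) @ w2 + b2

print("hidden shape:", h.shape, "rank of variable part:", rank)
print("dead unit gives constant output:", np.allclose(dead_out, b2))
```

Dropping the one-unit layer, as suggested above, lets the first wide layer see all 256 input values directly.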


 