Using Convolution Neural Net with Lasagne in Python error

I have used the framework provided by Daniel Nouri on his eponymous website. Here is the code I used. It looks fine to me; the only changes I made were to set output_nonlinearity=lasagne.nonlinearities.softmax and regression to False. Otherwise it is pretty straightforward.

from lasagne import layers
import theano
from lasagne.updates import sgd,nesterov_momentum
from nolearn.lasagne import NeuralNet
from sklearn.metrics import classification_report
import lasagne
import cv2
import numpy as np
from sklearn.cross_validation import train_test_split
from sklearn.datasets import fetch_mldata
import sys

mnist = fetch_mldata('MNIST original')
X = np.asarray(mnist.data, dtype='float32')
y = np.asarray(mnist.target, dtype='int32')

(trainX, testX, trainY, testY) = train_test_split(X,y,test_size =0.3,random_state=42)
trainX = trainX.reshape(-1, 1, 28, 28)
testX = testX.reshape(-1, 1, 28, 28)

clf = NeuralNet(
    layers=[
    ('input', layers.InputLayer),
    ('conv1', layers.Conv2DLayer),
    ('pool1', layers.MaxPool2DLayer),
    ('dropout1', layers.DropoutLayer),  # !
    ('conv2', layers.Conv2DLayer),
    ('pool2', layers.MaxPool2DLayer),
    ('dropout2', layers.DropoutLayer),  # !
    ('hidden4', layers.DenseLayer),
    ('dropout4', layers.DropoutLayer),  # !
    ('hidden5', layers.DenseLayer),
    ('output', layers.DenseLayer),
    ],
 input_shape=(None,1, 28, 28),
 conv1_num_filters=20, conv1_filter_size=(3, 3), pool1_pool_size=(2, 2),
 dropout1_p=0.1,  # !
 conv2_num_filters=50, conv2_filter_size=(3, 3), pool2_pool_size=(2, 2),
 dropout2_p=0.2,  # !
 hidden4_num_units=500,
 dropout4_p=0.5,  # !
 hidden5_num_units=500,

 output_num_units=10,

 output_nonlinearity=lasagne.nonlinearities.softmax,

 update=nesterov_momentum,

 # float32 was never defined in this snippet; np.float32 keeps the shared values in float32
 update_learning_rate=theano.shared(np.float32(0.03)),
 update_momentum=theano.shared(np.float32(0.9)),

 regression=False,
 max_epochs=3000,
 verbose=1,
 )

clf.fit(trainX,trainY)

However, when I run it, the training and validation losses are NaN from the first epoch:

input               (None, 1, 28, 28)       produces     784 outputs
conv1               (None, 20, 26, 26)      produces   13520 outputs
pool1               (None, 20, 13, 13)      produces    3380 outputs
dropout1            (None, 20, 13, 13)      produces    3380 outputs
conv2               (None, 50, 11, 11)      produces    6050 outputs
pool2               (None, 50, 6, 6)        produces    1800 outputs
dropout2            (None, 50, 6, 6)        produces    1800 outputs
hidden4             (None, 500)             produces     500 outputs
dropout4            (None, 500)             produces     500 outputs
hidden5             (None, 500)             produces     500 outputs
output              (None, 10)              produces      10 outputs
epoch    train loss    valid loss    train/val    valid acc  dur
-------  ------------  ------------  -----------  -----------  ------
  1           nan           nan          nan      0.09923  16.18s
  2           nan           nan          nan      0.09923  16.45s

Thanks in advance.

I'm very late to the game, but hopefully someone finds this answer useful!

In my experience, a number of things could be going wrong here. These are the steps I follow to debug this kind of problem in nolearn/lasagne:

  1. Using Theano's fast_compile optimizer can lead to underflow issues, which result in nan output (this was the ultimate problem in my case).

  2. If the output starts with nan values, or nan values appear soon after training starts, the learning rate may be too high. If it is 0.01, try 0.001.

  3. The input or output values may be too close to one another, so you may want to try scaling them. A standard approach is to scale the input by subtracting the mean and dividing by the standard deviation.

  4. Make sure you set regression=True when using nolearn on a regression problem.

  5. Try a linear output instead of softmax. Other nonlinearities sometimes help too, but in my experience not often.

  6. If all of this fails, try to isolate whether the issue is with your network or with your data. If you feed in random values within the expected range and still get nan output, the problem is probably not specific to the dataset you are training on.
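
Steps 3 and 6 can be tried with NumPy alone before touching the network. The following is only a minimal sketch, not code from the question or from nolearn: the helper names standardize, check_finite, and random_inputs_like are hypothetical, and the random arrays merely stand in for the raw MNIST pixels (0 to 255) loaded in the question.

```python
import numpy as np


def standardize(train, test, eps=1e-8):
    """Scale both arrays with the mean and std of the training set only."""
    mean = train.mean()
    std = train.std()
    # eps guards against division by zero if the training data is constant.
    scaled_train = ((train - mean) / (std + eps)).astype(np.float32)
    scaled_test = ((test - mean) / (std + eps)).astype(np.float32)
    return scaled_train, scaled_test


def check_finite(name, array):
    """Raise early if the data itself already contains NaN or inf."""
    if not np.isfinite(array).all():
        raise ValueError("%s contains NaN or inf values" % name)


def random_inputs_like(array, seed=0):
    """Random float32 data with the same shape and value range as `array`."""
    rng = np.random.RandomState(seed)
    low, high = float(array.min()), float(array.max())
    return rng.uniform(low, high, size=array.shape).astype(np.float32)


# Assumption: stand-in data shaped like the question's (N, 1, 28, 28) input.
raw_train = np.random.RandomState(1).randint(
    0, 256, size=(64, 1, 28, 28)).astype(np.float32)
raw_test = np.random.RandomState(2).randint(
    0, 256, size=(16, 1, 28, 28)).astype(np.float32)

check_finite("raw_train", raw_train)                          # data sanity check
train_scaled, test_scaled = standardize(raw_train, raw_test)  # step 3
fake_train = random_inputs_like(train_scaled)                 # step 6
```

With the question's code, train_scaled would be passed to clf.fit in place of trainX. If the loss is still nan when the network is trained on fake_train instead, the dataset is probably not the cause.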

Hope that helps!

