
Text Classification Using Neural Network

I am new to machine learning and neural networks. I am trying to do text classification with a neural network from scratch. My dataset has 7500 documents, each labeled with one of seven classes, and about 5800 unique words. I am using one hidden layer with 4000 neurons, the sigmoid activation function, a learning rate of 0.1, and no dropout.

After about 2 to 3 epochs of training, a warning is displayed:

RuntimeWarning: overflow encountered in exp

The resulting output list appears as:

[  0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
   0.00000000e+00   0.00000000e+00   4.11701866e-10]

for every input (all entries are zero except one, 4.11701866e-10).

Sigmoid function:

def sigmoid(x):
    output = 1 / (1 + np.exp(-x))
    return output

def sigmoid_output_to_derivative(output):
    return output * (1 - output)

How can I fix this? Can I use a different activation function?

Here is my full code: https://gist.github.com/coding37/a5705142fe1943b93a8cef4988b3ba5f

It is not easy to give a precise answer, since the problems can be manifold and are very hard to reconstruct, but I'll give it a try:

So it seems you're running into numerical trouble: the weights of your neurons scale your input vector x to large magnitudes, so np.exp(-x) overflows for strongly negative x, and the sigmoid output saturates to zero. A naive suggestion would be to increase the precision from float32 to float64, but I guess you are already at that precision.
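One common workaround (not part of the original code, just a sketch) is a numerically stable sigmoid that only ever calls exp() on non-positive arguments, so it cannot overflow:

```python
import numpy as np

def stable_sigmoid(x):
    # Split on the sign of x so that exp() is only evaluated on
    # non-positive arguments, which cannot overflow in float64.
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))       # safe: -x[pos] <= 0
    exp_x = np.exp(x[~pos])                        # safe: x[~pos] < 0
    out[~pos] = exp_x / (1.0 + exp_x)
    return out
```

This is algebraically identical to 1/(1+exp(-x)) but produces no RuntimeWarning even for very large |x|; scipy.special.expit does the same thing under the hood.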

Have you played around with the learning rate and/or tried an adaptive learning rate? (See https://towardsdatascience.com/learning-rate-schedules-and-adaptive-learning-rate-methods-for-deep-learning-2c8f433990d1 for some examples.) For a start, maybe try more iterations with a lower learning rate.
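A minimal sketch of one such schedule (step decay; the parameter names here are illustrative, not from your code) would be:

```python
def step_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=10):
    # Multiply the learning rate by `drop` every `epochs_per_drop` epochs,
    # e.g. 0.1 -> 0.05 -> 0.025 -> ...
    return initial_lr * drop ** (epoch // epochs_per_drop)
```

You would call this once per epoch and use the returned value in your weight update instead of a fixed 0.1.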

Also: are you using sigmoid functions in your output layer? The added non-linearity could drive your neurons into saturation, i.e. your problem.

Have you checked your gradients? This can also sometimes help in tracking down errors (http://ufldl.stanford.edu/wiki/index.php/Gradient_checking_and_advanced_optimization).
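In case it helps, here is a generic central-difference gradient check (assuming `f` is your scalar loss as a function of a weight array; this is a standalone sketch, not taken from the linked tutorial):

```python
import numpy as np

def numerical_grad(f, w, eps=1e-5):
    # Central-difference estimate of df/dw, one weight at a time.
    grad = np.zeros_like(w)
    for i in range(w.size):
        old = w.flat[i]
        w.flat[i] = old + eps
        f_plus = f(w)
        w.flat[i] = old - eps
        f_minus = f(w)
        w.flat[i] = old  # restore the original weight
        grad.flat[i] = (f_plus - f_minus) / (2 * eps)
    return grad
```

Compare its output against your backpropagated gradient with np.allclose; a large discrepancy usually points straight at the buggy layer.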

Alternatively, you could check whether your training improves with other activation functions, e.g. a linear activation for a start.

Since probabilities in machine learning tend to be very small, and computations on them lead to even smaller values (causing underflow errors), it is good practice to do your computations with logarithmic values.

Using float64 types isn't bad, but it will also fail eventually.

So instead of, e.g., multiplying two small probabilities, you should add their logarithmic values. The same goes for other operations like exp().
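As a small illustration of this idea (a generic sketch, not specific to your network): a product of probabilities becomes a sum of logs, and a sum of probabilities can be computed with the log-sum-exp trick, which subtracts the maximum before exponentiating so that exp() never overflows:

```python
import numpy as np

def log_product(log_p, log_q):
    # p * q in log space is simply addition.
    return log_p + log_q

def logsumexp(log_vals):
    # log(sum(exp(log_vals))), computed stably by factoring out the max.
    m = np.max(log_vals)
    return m + np.log(np.sum(np.exp(log_vals - m)))
```

scipy.special.logsumexp provides a well-tested version of the second function.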

Every machine learning framework I know either returns logarithmic model parameters by default or has a method for that. Or you can just use the built-in math functions.

