
Logistic Regression implementation with MNIST - not converging?

I hope someone can help me. I did an implementation of logistic regression from scratch (so without libraries, except numpy in Python).

I used the MNIST dataset as input and, since I am doing binary classification, decided to run a test on only two digits: 1 and 2. My code can be found here:

https://github.com/michelucci/Logistic-Regression-Explained/blob/master/MNIST%20with%20Logistic%20Regression%20from%20scratch.ipynb

The notebook should run on any system that has the necessary libraries installed.

Somehow my cost function is not converging. I am getting an error because my A (my sigmoid) becomes equal to 1 as z gets very big.
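The saturation described above is easy to reproduce in isolation. This is a minimal illustrative sketch, not the notebook's exact code (only the `sigmoid` name is taken from it):

```python
import numpy as np

# In float64, sigmoid(z) rounds to exactly 1.0 once z is large
# enough, and the cross-entropy term log(1 - A) becomes log(0).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

a = sigmoid(np.array([1.0, 10.0, 40.0]))
print(a)                     # the last entry is exactly 1.0

with np.errstate(divide='ignore'):
    print(np.log(1.0 - a))   # last entry is -inf: the cost blows up
```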

I tried everything but I can't see my error. Can anyone take a look and let me know if I missed something obvious? The point here is not getting high accuracy; it is getting the model to converge to something ;)

Thanks in advance, Umberto

I read your code. All looks fine. The only thing is that your learning rate is high. I know 0.005 is a small number, but in this case it is too high for the algorithm to converge. That is evident from the behaviour of the cost: it decreases for a while and then starts going negative very quickly. The idea is to have the cost close to zero; here, negative numbers do not imply a smaller cost, you have to look at the magnitude. I used 0.000008 as the learning rate and it works fine.
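To see how a step size that looks small can still be too large, here is a toy sketch on a 1-D quadratic (not the logistic-regression cost; `descend` is a hypothetical helper). The same overshooting mechanism is what makes a too-high learning rate diverge:

```python
import numpy as np

# Gradient descent on f(w) = w**2, whose gradient is 2*w.
# Update: w <- w - lr * 2*w. Any lr > 1 overshoots the minimum
# at w = 0 and the iterates grow instead of shrinking.
def descend(lr, steps=50, w0=1.0):
    w = w0
    for _ in range(steps):
        w -= lr * 2.0 * w
    return abs(w)

print(descend(0.1))   # shrinks toward 0: converges
print(descend(1.1))   # grows without bound: diverges
```

Whether a given learning rate is "too high" depends on the scale of the gradients, which is why 0.005 can diverge on one problem and work on another.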

I found the error. The problem was that I used 1 and 2 (the labels you find in MNIST) as class labels, but in binary classification you compare those values with 0 and 1, so the model could not converge, since sigmoid() (see my code) can only go from 0 to 1 (it is a probability).

Using 0 and 1 instead of 1 and 2 solved the problem beautifully. Now my model converges to 98% accuracy :-)
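As a minimal illustration of the fix (the array names here are hypothetical, not from the notebook), the MNIST labels can be remapped to {0, 1} with a numpy comparison before training:

```python
import numpy as np

# Labels as they come from MNIST for this two-class subset.
y_mnist = np.array([1, 2, 2, 1, 2])

# Remap: digit 2 -> 1, digit 1 -> 0, so the targets live in the
# same [0, 1] range as the sigmoid output.
y = (y_mnist == 2).astype(np.float64)
print(y)  # [0. 1. 1. 0. 1.]
```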

Thanks everyone for helping!

Regards, Umberto

