
Problems with Linear Regression Program

I'm trying to use a linear regression program to predict handwritten digits using the MNIST dataset. Whenever I run it, the gradient descent loop is slow and takes a long time to approach the correct weights. In eight hours it has completed 550 iterations and there is still a lot of error. Can someone tell me whether it normally takes this long, or whether I am doing something wrong?

import numpy as np
import pandas as pd

mnist = pd.read_csv('mnist_train.csv')[:4200]
x = np.array(mnist)[:4200,1:]
y = np.array(mnist)[:4200,0].reshape(4200,1)

#How many numbers in dataset
n = len(x)
#How many values in each number
n1 = len(x[0])

#sets all weights equal to 1
coef = np.array([1 for i in range(n1)])

epochs = 1000000000000
learning_rate = .000000000008999
for i in range(epochs):
    cur_y = sum(x*coef)
    error = y-cur_y
    #Calculates Gradient
    grad = (np.array([sum(sum([-2/n  * (error)* x[j,i] for j in range(n)])) for i in range(n1)]))
    #Updates Weights
    coef = (-learning_rate * grad) + coef
    print(i)
    print(sum(y-(x*coef)))

Your learning rate is extremely small. Also, 784 dimensions is a lot for linear regression to tackle, especially assuming you're using all 60,000 samples. An SVM would work better and, obviously, a CNN would be best.
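To illustrate how much the step size matters, here is a minimal sketch that runs the same full-batch gradient descent twice on synthetic data, once with the learning rate from the question and once with a larger one. The data, the `gradient_descent` helper, and the value 0.05 are assumptions made for this illustration, not values tuned for MNIST. A usable learning rate depends on the scale of the features, and raw pixel values in the range 0-255 tolerate a much smaller step than the standardized features used here.

```python
import numpy as np

def gradient_descent(X, y, learning_rate, steps):
    """Full-batch gradient descent on mean squared error; returns the final MSE."""
    n = X.shape[0]
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        residual = X @ w - y                 # prediction error, shape (n,)
        grad = 2.0 / n * (X.T @ residual)    # gradient of the MSE with respect to w
        w -= learning_rate * grad
    return float(np.mean((X @ w - y) ** 2))

# Synthetic stand-in for real data (assumption: standardized features, no noise).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
y = X @ true_w

# Same number of steps, very different progress.
mse_tiny = gradient_descent(X, y, learning_rate=8.999e-12, steps=100)
mse_larger = gradient_descent(X, y, learning_rate=0.05, steps=100)
print(mse_tiny, mse_larger)
```

With the tiny rate the weights barely move in 100 steps, while the larger rate drives the error close to zero on this toy problem.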

Given that your error is getting smaller, I would recommend increasing your learning rate and training with stochastic gradient descent (grabbing a random batch from your training set at each step instead of using the whole training set).
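A rough sketch of that idea follows, assuming a mean squared error loss and synthetic data in place of the MNIST CSV. The function name `minibatch_sgd` and its default values for `learning_rate`, `batch_size`, and `steps` are hypothetical choices for this example and would need tuning on real data. The gradient is computed with matrix products rather than Python loops, which is what keeps each step cheap.

```python
import numpy as np

def minibatch_sgd(X, y, learning_rate=0.01, batch_size=32, steps=500, seed=0):
    """Linear regression trained with mini-batch stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        # Draw a random batch instead of using the whole training set.
        idx = rng.choice(n, size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        # Vectorized gradient of the batch MSE with respect to w.
        grad = 2.0 / batch_size * (Xb.T @ (Xb @ w - yb))
        w -= learning_rate * grad
    return w

# Synthetic regression problem (assumption: stands in for the real dataset).
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.01 * rng.normal(size=1000)

w = minibatch_sgd(X, y)
mse = float(np.mean((X @ w - y) ** 2))
print(mse)
```

Each step touches only `batch_size` rows, so its cost no longer grows with the size of the training set.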
