
Simple linear regression in Python using only for loops

I am doing a linear regression analysis. The input variable is Size and the output variable is Price. The data set is stored in a 2D array. I know the analysis is easy to do with NumPy, but my professor told me to use only for loops to perform the iterations. The iteration formula is shown in the picture at the hyperlink. I therefore wrote the following code to perform the calculation:

#Structure of array (Stored in float), with structure like this [Room, Price]
array = [[4.0, 399.9], [5.0, 329.9], [6.0, 369.0]]

#Set initial value
theta_price = 0
theta_room = 0
stepsize = 0.01
item = 3

#Perform iterations
for looping in range(0, 50): #Loop 50 times
    for j in array[0]: #Call the value stored in array[0]
        for k in array[1]: #Call the value stored in array[1]
             theta_price_1 = theta_price - stepsize * (1 / item) * (sum((theta_price + theta_room * int(j) - int(k)))#Perform iterations of theta 0
             theta_room_1 = theta_room - stepsize * (1 / item) * (sum((theta_price + t + theta_room * int(j) - int(k))*int(j)))#Perform iterations of theta 1
             #Bring the new theta value to the next loop
             theta_price = theta_price_1
             theta_room = theta_room_1
             print(theta_price,theta_room)#Print the result for every loop

The code above does not run. It fails at line 10 with this error message:

'int' object is not iterable

If I remove the sum function, the code runs but gives incorrect results. So I know the problem lies in how I use the sum function with the array, but I don't know how to solve it.
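For reference, the error message can be reproduced in isolation. The snippet below is only a minimal sketch: `error_term` is a hypothetical name standing in for the expression passed to `sum` in the question, which evaluates to a single number rather than a list.

```python
# A single number, like theta_price + theta_room * int(j) - int(k) in the question
# (illustrative values: both thetas are 0, j = 4, k = 399)
error_term = 0 + 0 * 4 - 399

try:
    sum(error_term)  # sum() expects an iterable, not a bare number
except TypeError as exc:
    message = str(exc)
    print(message)  # 'int' object is not iterable

# Wrapping the value in a list makes it iterable, so sum() accepts it
total = sum([error_term])
print(total)
```

This only shows why the call fails. Summing a one-element list still does not add up the errors of all data points, which is what the update formula needs.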

As I mentioned in the comment, sum should be applied over all elements in each iteration; that is what batch gradient descent does. So the code should be:

theta_price = 0
theta_room = 0
stepsize = 0.1
item = 5
#Perform iterations
array = [
            [0,1,2,3,4],
            [5,6,7,8,9],
        ]

for looping in range(0, 500): #Loop 500 times
    theta_price = theta_price - stepsize * (1 / item) * (sum([theta_price + theta_room * int(j) - int(k) for j, k in zip(array[0], array[1])]))#Update theta 0 with the error summed over all data points
    theta_room = theta_room - stepsize * (1 / item) * (sum([(theta_price + theta_room * int(j) - int(k)) * int(j) for j, k in zip(array[0], array[1])]))#Update theta 1 (uses the already-updated theta_price)
    print(theta_price,theta_room)#Print the result for every loop

After 500 iterations on the 5 test data points, I get the result:

4.999999614653767 1.0000001313279816

which is as expected, since the test data satisfy y = x + 5.
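If the requirement is to use for loops only, the same batch update can also be written without `sum` or list comprehensions. The following is a rough sketch, not the only way to do it: the function name `fit_line` and the variable names are my own, and it assumes the two thetas should be updated simultaneously from the same error terms, which is a slight variation on the code above.

```python
# Test data from the answer above: prices = sizes + 5
sizes = [0, 1, 2, 3, 4]
prices = [5, 6, 7, 8, 9]


def fit_line(xs, ys, stepsize=0.1, iterations=500):
    """Batch gradient descent for y ~ theta0 + theta1 * x using plain for loops."""
    theta0 = 0.0  # intercept
    theta1 = 0.0  # slope
    m = len(xs)
    for _ in range(iterations):
        grad0 = 0.0
        grad1 = 0.0
        # Accumulate the error over ALL data points before touching the thetas
        for x, y in zip(xs, ys):
            error = theta0 + theta1 * x - y
            grad0 += error
            grad1 += error * x
        # Simultaneous update: both gradients were computed from the old thetas
        theta0 -= stepsize * grad0 / m
        theta1 -= stepsize * grad1 / m
    return theta0, theta1


theta0, theta1 = fit_line(sizes, prices)
print(theta0, theta1)  # should be close to 5 and 1
```

With data stored one sample per row, as in the question's `[Room, Price]` layout, the inner loop would iterate over the rows directly (for example `for size, price in array`) instead of zipping two lists.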


 