I'm not sure what's wrong with my code (linear/polynomial regression)

Question

I have a data set (a CSV file) with three separate columns. Column 0 is the signal time, column 1 is the frequency, and column 2 is the intensity. There is a lot of noise in the data, which can be sorted out by finding the variance of each signal frequency. If it is < 2332, then it is the right frequency. Hence, this is the data I want to calculate linear/polynomial regression on. P.S. I have to calculate the linear regression manually :(. The nested for-loop decision structure I have isn't currently working. Any solutions would be helpful! Thanks.

data = csv.reader(file1)
sort = sorted(data, key=(operator.itemgetter(1))) #sorted by the frequencies
for row in sort:
    x.append(float(row[0]))
    y.append(float(row[2]))
    frequencies.append(float(row[1]))

for i in range(499):
    freq_dict.update({ frequencies[i] : [x[i], y[i]] })

for key in freq_dict.items(): 
   for row in sort : 
       if key == float(row[1]):
           a.append(float(row[1]))
           b.append(float(row[2]))
           c.append(float(row[0]))
       else :
           num = np.var(a)
           if num < 2332.0: 
               linearRegression(c, b, linear)
               print('yo')
               polyRegression(c, b, d, linear, py)
               mplot.plot(linear, py)
           else: 
               a = [] 
               b = [] 
               c = []

Variances of 2332 or less are the frequencies I need.

I used a range of 499 because that is the length of my data set. Also, I tried to clear the lists (a, b, c) if the frequency wasn't correct.

Answer 1

I see several issues. I am unsure why you sort your data if you already know the exact values you are looking for. I am also unsure why you split the data into separate variables. The double "for" loop means that you repeat everything in "sort" for every single key in freq_dict; I am not sure whether you intended to repeat all those values multiple times. Also, freq_dict.items() produces tuples (key, value pairs), so your "key" is a tuple and will never equal a float. Anyway, here is an attempt to rewrite some of the code.

import csv
import numpy
import matplotlib.pyplot as plt
from scipy import stats

# Read the file; csv.reader yields strings, so convert each field to float.
data = [[float(v) for v in row] for row in csv.reader(file1)]
# Filter the rows to the condition (second column below 2332).
f_data = [(x, f, y) for x, f, y in data if f < 2332.0]
# Split the data down the columns.
x, _, y = zip(*f_data)

# Standard linear stats function.
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

# Plot the data and the fit line.
plt.scatter(x, y)
plt.plot(x, numpy.array(x) * slope + intercept)
plt.show()
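
The question mentions that the linear fit has to be calculated manually, so stats.linregress may not be an option. As a rough sketch only, the closed-form least-squares slope and intercept can be written in plain Python. The function name `linear_fit` and the sample numbers are made up for illustration and do not come from the question's data.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Hypothetical helper for illustration; uses the textbook formulas
    slope = Sxy / Sxx and intercept = mean_y - slope * mean_x.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points lying exactly on y = 2x + 1.
slope, intercept = linear_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(slope, intercept)  # prints 2.0 1.0
```

On real data the result should agree with the slope and intercept returned by stats.linregress up to floating-point rounding.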

Answer 2

A solution closer to the original code was to use the corrcoef of the lists. In a similar style, it was as follows:

for key, value in freq_dict.items(): #1487
    for row in sort:   #when row -> goes to a new freq it calculates corrcoef of an empty list.
        if key == float(row[1]): #1487
            a.append(float(row[2]))
            b.append(float(row[0]))
        else:
            if a:  # skip the calculation while the lists are still empty
                num = np.corrcoef(b, a)[0, 1]
                if num < somenumber:  # somenumber is a placeholder threshold
                    pass  # do stuff
            a = [] #clear the lists and reset number
            b = []
            num = 0
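
The nested loop above rescans every row for every key. A possible alternative, offered only as a sketch, is to group the rows by frequency in a single pass and then test each group once. It assumes the variance in question is the variance of the intensity values within each frequency (the question does not state this explicitly), and it uses statistics.pvariance, which computes the population variance like np.var does by default. The name `select_low_variance_groups` and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import pvariance


def select_low_variance_groups(rows, threshold=2332.0):
    """Group (time, frequency, intensity) rows by frequency and keep the
    groups whose intensity variance is below `threshold`.

    Assumption: the variance test applies to the intensities of each group.
    """
    groups = defaultdict(lambda: ([], []))  # frequency -> (times, intensities)
    for time, freq, intensity in rows:
        times, intensities = groups[float(freq)]
        times.append(float(time))
        intensities.append(float(intensity))

    selected = {}
    # .items() yields (key, value) tuples, so unpack them in the loop header.
    for freq, (times, intensities) in groups.items():
        if pvariance(intensities) < threshold:
            selected[freq] = (times, intensities)
    return selected


# Made-up rows as csv.reader would yield them (all fields are strings).
sample = [
    ("0.0", "100.0", "10.0"),
    ("1.0", "100.0", "12.0"),
    ("2.0", "100.0", "14.0"),
    ("0.0", "200.0", "0.0"),
    ("1.0", "200.0", "500.0"),
]
selected = select_low_variance_groups(sample)
print(sorted(selected))  # prints [100.0]
```

Each entry of `selected` holds the times and intensities for one frequency, which could then be passed to the regression and plotting steps.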

2 answers

Answer 1
0 2020-04-24 07:10:10

Answer 2
0 2020-04-25 21:17:13
