
Algorithm to detect a linear part of a data set

I hope I'm in the right place to ask this. My problem is the following: I have a set of data (two lists, x and y), and I don't have much other information about it (no underlying function or anything like that). My goal is to find the subset of this data that is linear (the part highlighted in yellow in the image below).

[Image: plot of the data, with the linear part highlighted in yellow]

As you can see in the image, after plotting the data it becomes linear for a while, and I want to detect that subset automatically. Since I don't have the function behind it, I'm really lost!

Does anyone have an idea of how this can be done? Is there an algorithm or a mathematical method I can implement? (I'm using Python, by the way.)

You could start by computing the slope between two consecutive points from their x and y values.

Say the slope between points 1 and 2 is 2. Then calculate the slope between points 2 and 3. If it differs from the first slope, you know those three points are not on one line.

Just loop through the whole data set, get the slope between each point and the next, and compare it with the slope of the following pair.

from decimal import Decimal

def linear_equation(p1, p2):
    # points are sequences like p = (x, y)
    m = slope(p1, p2)  # slope
    c = p2[1] - (m * p2[0])  # y-intercept of the line
    return 'y=' + str(m) + 'x' + '+' + str(c)

def slope(p1, p2):
    # Decimal gives exact division, so the equality test below is reliable
    return Decimal(p2[1] - p1[1]) / Decimal(p2[0] - p1[0])

points = [[0,0],[1,1],[2,2],[3,4],[4,5],[5,6],[7,30],[8,35],[9,39]]


for p in range(0, len(points) - 2):
    # if the slopes of points (a,b) and (b,c) are the same, print the equation
    # you could omit the if statement if you just want to calculate the
    # equation for each pair of points and do the comparisons later;
    # in that case, change the range bound to -1 instead of -2.
    if slope(points[p], points[p+1]) == slope(points[p+1], points[p+2]):
        print(linear_equation(points[p], points[p+1]))
    else:
        print("Non-Linear")

Output:

y=1x+0
Non-Linear
Non-Linear
y=1x+1
Non-Linear
Non-Linear
Non-Linear
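Note that the exact equality test only works on clean data like the list above. With measured data the slopes will almost never match exactly, so here is a rough sketch of the same idea with a tolerance. It returns the longest run of points whose consecutive slopes roughly agree. The function names and the rel_tol default are my own choices, not something from your data, and it assumes at least two points with distinct x values.

```python
import math

def segment_slopes(xs, ys):
    """Slope between each pair of consecutive points."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

def longest_linear_run(xs, ys, rel_tol=0.05, abs_tol=1e-9):
    """Return (start, end) point indices, inclusive, of the longest run
    whose consecutive slopes agree within the tolerance (an assumed threshold)."""
    slopes = segment_slopes(xs, ys)
    best = (0, 1)  # two points always lie on a line
    start = 0      # index of the first segment in the current run
    for i in range(1, len(slopes)):
        if not math.isclose(slopes[i], slopes[i - 1], rel_tol=rel_tol, abs_tol=abs_tol):
            start = i  # slope changed, so a new run begins at segment i
        # segments start..i span points start..i+1
        if (i + 1) - start > best[1] - best[0]:
            best = (start, i + 1)
    return best

points = [[0,0],[1,1],[2,2],[3,4],[4,5],[5,6],[7,30],[8,35],[9,39]]
xs = [p[0] for p in points]
ys = [p[1] for p in points]
print(longest_linear_run(xs, ys))  # (0, 2)
```

Since each slope is only compared with the previous one, a slowly bending curve can drift past the tolerance without being noticed, so you will probably have to tune rel_tol for your data or check the run afterwards with a straight-line fit.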
