Piecewise linear regression in Python

Question

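Is there a library in Python that does piecewise (segmented) linear regression? I would like to fit multiple lines to my data automatically, to get something like this: (figure: segmented regression)

By the way, I know the number of segments.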

Answer 1

NumPy's numpy.piecewise() can probably be used for this.
A more detailed description is given here: How to apply piecewise linear fit in Python?
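As a rough illustration of the numpy.piecewise() route, the sketch below fits a continuous two-segment line with scipy.optimize.curve_fit. The synthetic data, the function name two_segment and the initial guess p0 are made up for this example and do not come from the linked answer. More segments would need extra breakpoint and slope parameters, and the result can depend on the initial guess.

```python
import numpy as np
from scipy.optimize import curve_fit


def two_segment(x, x0, y0, k1, k2):
    """Continuous two-segment line that bends at the point (x0, y0)."""
    return np.piecewise(
        x,
        [x < x0, x >= x0],
        [lambda t: k1 * (t - x0) + y0, lambda t: k2 * (t - x0) + y0],
    )


# Hypothetical noise-free data: slope 2 up to x = 6, slope -0.5 afterwards
x = np.linspace(0, 10, 50)
y = np.where(x < 6, 1.0 + 2.0 * x, 13.0 - 0.5 * (x - 6))

# Rough initial guess (an assumption): bend near the middle of the data
p0 = [x.mean(), y.mean(), 1.0, -1.0]
params, _ = curve_fit(two_segment, x, y, p0=p0)
x0_fit, y0_fit, k1_fit, k2_fit = params

# Compare against a single straight line fitted to the same data
rss_piecewise = np.sum((two_segment(x, *params) - y) ** 2)
rss_line = np.sum((np.polyval(np.polyfit(x, y, 1), x) - y) ** 2)
print("breakpoint estimate:", x0_fit)
print("RSS piecewise vs. single line:", rss_piecewise, rss_line)
```

Because the number of segments is known here, the model function can be written out by hand; the fitted breakpoint is simply one of the returned parameters.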

If that is not what you need, you may find some useful information in these questions: https://datascience.stackexchange.com/questions/8266/is-there-a-library-that-would-perform-segmented-linear-regression-in-python

and here:
https://datascience.stackexchange.com/questions/8457/python-library-for-segmented-regression-aka-piecewise-regression

Answer 2

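As mentioned in a comment above, segmented linear regression comes with the problem of many free parameters. I therefore decided against the approach that uses n_segments * 3 - 1 parameters (i.e. n_segments - 1 segment positions, n_segments y-offsets and n_segments slopes) and performs a numerical optimization. Instead, I look for regions that already have a roughly constant slope.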

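Algorithm

Compute the slope at every point
Cluster points with similar slope into segments (done by the DecisionTree)
Perform a linear regression on each segment found in the previous step

A decision tree is used instead of a clustering algorithm so that the result is a set of connected segments rather than sets of (non-neighboring) points. The level of detail of the segmentation can be adjusted through the decision tree parameters (currently max_leaf_nodes).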
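To see why the tree produces connected segments, the short check below may help: a tree fitted on a single feature can only split on thresholds of x, so each leaf covers one interval of x. The data and variable names are illustrative assumptions, not part of the answer's code.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Illustrative data: slopes of a tanh curve sampled on a sorted grid
xs = np.linspace(-1, 1, 40)
slopes = np.gradient(np.tanh(3 * xs), xs)

tree = DecisionTreeRegressor(max_leaf_nodes=3, random_state=0)
tree.fit(xs.reshape(-1, 1), slopes)

# apply() returns the id of the leaf each sample ends up in
leaf_ids = tree.apply(xs.reshape(-1, 1))

all_contiguous = True
for leaf in np.unique(leaf_ids):
    idx = np.flatnonzero(leaf_ids == leaf)
    contiguous = bool(np.all(np.diff(idx) == 1))  # no gaps in the index run
    all_contiguous = all_contiguous and contiguous
    print(f"leaf {leaf}: x from {xs[idx[0]]:.2f} to {xs[idx[-1]]:.2f}, "
          f"contiguous={contiguous}")
```

A generic clustering algorithm applied to the slope values alone would have no such guarantee, since two distant regions with similar slope could end up in the same cluster.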

Code

import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

# parameters for setup
n_data = 20

# segmented linear regression parameters
n_seg = 3

np.random.seed(0)
fig, (ax0, ax1) = plt.subplots(1, 2)

# example 1
#xs = np.sort(np.random.rand(n_data))
#ys = np.random.rand(n_data) * .3 + np.tanh(5* (xs -.5))

# example 2
xs = np.linspace(-1, 1, n_data)
ys = np.random.rand(n_data) * .3 + np.tanh(3*xs)

dys = np.gradient(ys, xs)

rgr = DecisionTreeRegressor(max_leaf_nodes=n_seg)
rgr.fit(xs.reshape(-1, 1), dys.reshape(-1, 1))
dys_dt = rgr.predict(xs.reshape(-1, 1)).flatten()

ys_sl = np.ones(len(xs)) * np.nan
for y in np.unique(dys_dt):
    msk = dys_dt == y
    lin_reg = LinearRegression()
    lin_reg.fit(xs[msk].reshape(-1, 1), ys[msk].reshape(-1, 1))
    ys_sl[msk] = lin_reg.predict(xs[msk].reshape(-1, 1)).flatten()
    ax0.plot([xs[msk][0], xs[msk][-1]],
             [ys_sl[msk][0], ys_sl[msk][-1]],
             color='r', zorder=1)

ax0.set_title('values')
ax0.scatter(xs, ys, label='data')
ax0.scatter(xs, ys_sl, s=3**2, label='seg lin reg', color='g', zorder=5)
ax0.legend()

ax1.set_title('slope')
ax1.scatter(xs, dys, label='data')
ax1.scatter(xs, dys_dt, label='DecisionTree', s=2**2)
ax1.legend()

plt.show()

Answer 3

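You just need to sort X in ascending order and fit several linear regressions. You can use LinearRegression from sklearn.

For example, splitting the curve into 2 segments would look like this: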

from sklearn.linear_model import LinearRegression
import numpy as np
import matplotlib.pyplot as plt
X = np.array([-5,-4,-3,-2,-1,0,1,2,3,4,5])
Y = X**2
X=X.reshape(-1,1)
reg1 = LinearRegression().fit(X[0:6,:], Y[0:6])
reg2 = LinearRegression().fit(X[6:,:], Y[6:])

fig = plt.figure('Plot Data + Regression')
ax1 = fig.add_subplot(111)
ax1.plot(X, Y, marker='x', c='b', label='data')
ax1.plot(X[0:6,],reg1.predict(X[0:6,]), marker='o',c='g', label='linear r.')
ax1.plot(X[6:,],reg2.predict(X[6:,]), marker='o',c='g', label='linear r.')
ax1.set_title('Data vs Regression')
ax1.legend(loc=2)
plt.show()

I wrote a similar implementation; the code is here: https://github.com/mavaladezt/Segmented-Algorithm
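In the example above the split index (6) is fixed by hand. A minimal sketch of choosing it automatically is shown below, assuming a brute-force search over all split positions that minimizes the summed squared error of two independent line fits. The helper best_two_segment_split and its min_points parameter are hypothetical names introduced for this illustration; this is not the method used in the linked repository.

```python
import numpy as np


def best_two_segment_split(x, y, min_points=2):
    """Return (split_index, total_sse) for the best pair of line fits.

    x must be sorted in ascending order. Points x[:i] form the first
    segment and x[i:] the second.
    """
    best_idx, best_sse = None, np.inf
    for i in range(min_points, len(x) - min_points + 1):
        sse = 0.0
        for seg_x, seg_y in ((x[:i], y[:i]), (x[i:], y[i:])):
            coef = np.polyfit(seg_x, seg_y, 1)  # slope and intercept
            sse += np.sum((np.polyval(coef, seg_x) - seg_y) ** 2)
        if sse < best_sse:
            best_idx, best_sse = i, sse
    return best_idx, best_sse


# Same data as in the example above
x = np.arange(-5, 6, dtype=float)
y = x ** 2
idx, sse = best_two_segment_split(x, y)
print("split index:", idx, "total squared error:", sse)
```

For this symmetric data the search lands next to the vertex of the parabola. The two fitted lines are not forced to meet at the split, which is also the case in the hand-split example.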

Answer 4

There is a piecewise-regression Python library that does this. Github link.

A simple example with 1 breakpoint. For demonstration, first generate some example data:

import numpy as np

alpha_1 = -4
alpha_2 = -2
constant = 100
breakpoint_1 = 7
n_points = 200
np.random.seed(0)
xx = np.linspace(0, 20, n_points)
yy = constant + alpha_1*xx + (alpha_2-alpha_1) * np.maximum(xx - breakpoint_1, 0) + np.random.normal(size=n_points)
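The hinge term np.maximum(xx - breakpoint_1, 0) is what makes the slope change from alpha_1 to alpha_2 at the breakpoint while keeping the line continuous. The check below is a small sketch with the noise term left out (an assumption made so that the slopes can be recovered exactly); it fits a straight line to each side of the breakpoint.

```python
import numpy as np

alpha_1, alpha_2 = -4, -2
constant, breakpoint_1 = 100, 7

# Same model as above, but without the random noise term
xx = np.linspace(0, 20, 201)
yy = constant + alpha_1 * xx + (alpha_2 - alpha_1) * np.maximum(xx - breakpoint_1, 0)

# Fit a straight line on each side of the breakpoint
left = xx < breakpoint_1
slope_left = np.polyfit(xx[left], yy[left], 1)[0]
slope_right = np.polyfit(xx[~left], yy[~left], 1)[0]
print("slope left of breakpoint:", slope_left)
print("slope right of breakpoint:", slope_right)
```

Left of the breakpoint the hinge term is zero, so the slope is alpha_1. To the right it contributes (alpha_2 - alpha_1), which adds up to alpha_2.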

Then fit a piecewise model:

import piecewise_regression
pw_fit = piecewise_regression.Fit(xx, yy, n_breakpoints=1)
pw_fit.summary()

And plot it:

import matplotlib.pyplot as plt
pw_fit.plot()
plt.show()

Example 2: 4 breakpoints. Now let's look at some data similar to the original question, with 4 breakpoints.

import numpy as np

gradients = [0,2,1,2,-1,0]
constant = 0
breakpoints = [-4, -2, 1, 4] 
n_points = 200
np.random.seed(0)
xx = np.linspace(-10, 10, n_points)
yy = constant + gradients[0]*xx + np.random.normal(size=n_points)*0.5
for bp_n in range(len(breakpoints)):
    yy += (gradients[bp_n+1] - gradients[bp_n]) * np.maximum(xx - breakpoints[bp_n], 0)

Fit the model and plot it:

import piecewise_regression
import matplotlib.pyplot as plt

pw_fit = piecewise_regression.Fit(xx, yy, n_breakpoints=4)

pw_fit.plot()

plt.xlabel("x")
plt.ylabel("y")
plt.ylim(-10, 20)
plt.show()

The code examples are in this Google Colab notebook.


4 answers

Answer 1
4 2016-01-19 13:43:04

Answer 2
2 2020-05-11 15:11:18

Answer 3
0 2020-05-01 19:54:26

Answer 4
0 2021-12-06 10:48:32
