Sklearn polynomial regression flat with datetime x values

I am getting a flat regression even with a 10th-degree regressor. But if I change the date values to numeric ones, the regression works. Does anybody know why?

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LinearRegression

## RESHAPE DATA ##
# transformed_data is the DataFrame shown below (columns ds and y)
X = transformed_data.ds.values.reshape(-1, 1)
y = transformed_data.y
# X = data.fecha.dt.day.values.reshape(-1, 1)  # numeric alternative (day of month)

## PLOT ##
fig, ax = plt.subplots(figsize=(15,8))
ax.plot(X, y, 'o', label="data")

# Fit and plot one polynomial pipeline per degree
for i in range(1, 10):
    polyreg = make_pipeline(PolynomialFeatures(i), LinearRegression())
    polyreg.fit(X, y)
    pred = polyreg.predict(X)  # predict once and reuse for metrics and plot
    mse = round(np.mean((y - pred)**2))
    mae = round(np.mean(abs(y - pred)))
    ax.plot(X, pred, label=f'Degree: {i} MSE: {mse:,} MAE: {mae:,}')
ax.legend()
Datetime Data
    ds          y
0   2019-01-10  3658.0
1   2019-01-11  2952.0
2   2019-01-12  2855.0
3   2019-01-13  3904.0

Flat regressions

Numeric Data
    ds  y
0   10  3658.0
1   11  2952.0
2   12  2855.0
3   13  3904.0

Curved regressions

Linear regression associates each numeric input value with a fitted coefficient. The input values are multiplied by those coefficients, and the result is the output used for predictions.
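To make the "values times coefficients" idea concrete, here is a minimal sketch using the four numeric rows from the question. The variable names (`features`, `model`, `manual`) are only illustrative; the point is that the prediction is the feature matrix multiplied by the fitted coefficients, plus the intercept.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Numeric version of the sample data from the question.
X = np.array([[10.0], [11.0], [12.0], [13.0]])
y = np.array([3658.0, 2952.0, 2855.0, 3904.0])

# Expand x into polynomial columns [x, x^2]; the intercept is handled by the model.
features = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)

model = LinearRegression().fit(features, y)

# A prediction is each feature value multiplied by its coefficient, plus the intercept.
manual = features @ model.coef_ + model.intercept_
matches = np.allclose(manual, model.predict(features))
```

Here `matches` confirms that the manual product agrees with `model.predict`, which is why every input has to be a plain number before fitting.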

In your case, however, the predictor is a date, and the regression model cannot use it meaningfully in that form. As you noticed, you need to convert the dates to numeric values first.
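One possible way to do the conversion is sketched below, assuming the `ds` column holds pandas datetimes: measure each date as the number of days elapsed since the first observation. The DataFrame is rebuilt from the four sample rows in the question, and the choice of "days since the first date" is an assumption for illustration, not the only valid encoding. The sketch also shows how large a raw `datetime64[ns]` value is when read as an integer, which is a plausible (but here unverified) reason the polynomial terms degenerate on raw dates.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# The four sample rows from the question.
df = pd.DataFrame({
    "ds": pd.to_datetime(["2019-01-10", "2019-01-11", "2019-01-12", "2019-01-13"]),
    "y": [3658.0, 2952.0, 2855.0, 3904.0],
})

# A nanosecond-resolution datetime read as an integer is about 1.5e18,
# so its powers are astronomically large.
raw_ns = np.datetime64("2019-01-10", "ns").astype("int64")

# Convert the dates to elapsed days since the first observation: 0, 1, 2, 3.
days = (df["ds"] - df["ds"].min()).dt.days.to_numpy(dtype=float)
X = days.reshape(-1, 1)
y = df["y"].to_numpy()

# Same pipeline as in the question, now fitted on small numeric values.
polyreg = make_pipeline(PolynomialFeatures(2), LinearRegression())
polyreg.fit(X, y)
pred = polyreg.predict(X)

# A non-zero spread in the predictions means the fitted curve is not flat.
spread = pred.max() - pred.min()
```

When plotting, the original dates can still be used on the x axis (`ax.plot(df["ds"], pred)`), since only the model input needs to be numeric.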
