
How do I predict the near future value correctly in Python?

I need help. I'm currently deploying my LSTM model in Flask (Python) and trying to write the predictions to a new CSV file, but the file ends up filled with repeated results. I can't tell which line of code is wrong. Please correct me and give me some tips. Thanks a lot!

model.py

import numpy as np
from math import sqrt
from numpy import concatenate
from matplotlib import pyplot
from pandas import read_csv
from pandas import DataFrame
from pandas import concat
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from pickle import dump




def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)
    
# load dataset
np.random.seed(7)
# load the dataset
dataframe = read_csv('Sales.csv', usecols=[1], engine='python', skipfooter=3)
dataset = dataframe.values
dataset = dataset.astype('float32')
# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
# reshape into X=t and Y=t+1
look_back = 1
train_X, train_Y = create_dataset(train, look_back)
test_X, test_Y = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
train_X = np.reshape(train_X, (train_X.shape[0], 1, train_X.shape[1]))
test_X = np.reshape(test_X, (test_X.shape[0], 1, test_X.shape[1]))



model = Sequential()
model.add(LSTM(128, return_sequences=True ,input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dropout(0.2))
model.add(LSTM(64))


model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
history = model.fit(train_X, train_Y, epochs=100, batch_size=128, validation_data=(test_X, test_Y), verbose=2, shuffle=False)


#save the model
model.save('model.h5')

app.py

from flask import Flask, make_response, request
import io
from io import StringIO
import csv
import pandas as pd
import numpy as np
from keras.models import load_model
from sklearn.preprocessing import MinMaxScaler
# `from pandas import datetime` was dropped: it is unused here and no longer exists in pandas 2.x
from dateutil.relativedelta import relativedelta

app = Flask(__name__)

@app.route('/')
def form():
    return """
        <html>
            <body>
                <h1>Let's TRY to Predict..</h1>
                </br>
                </br>
                <p> Insert your CSV file and then download the Result
                <form action="/transform" method="post" enctype="multipart/form-data">
                    <input type="file" name="data_file" class="btn btn-block"/>
                    </br>
                    </br>
                    <button type="submit" class="btn btn-primary btn-block btn-large">Predict</button>
                </form>

                 <div class="ct-chart ct-perfect-fourth"></div>

            </body>
        </html>
    """

@app.route('/transform', methods=["POST"])
def transform_view():
 if request.method == 'POST':
    f = request.files['data_file']
    if not f:
        return "No file"

    
    stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)
    csv_input = csv.reader(stream)
    stream.seek(0)
    result = stream.read()
    df = pd.read_csv(StringIO(result), usecols=[1])
    
    #extract month value
    df2 = pd.read_csv(StringIO(result))
    matrix2 = df2[df2.columns[0]].to_numpy()
    list1 = matrix2.tolist()
     
    # load the model from disk
    model = load_model('model.h5')
    dataset = df.values
    dataset = dataset.astype('float32')
    scaler = MinMaxScaler(feature_range=(0, 1))
    dataset = scaler.fit_transform(dataset)
    dataset = np.reshape(dataset, (dataset.shape[0], 1, dataset.shape[1]))
    predict = model.predict(dataset)
    transform = scaler.inverse_transform(predict)

    X_FUTURE = 100
    transform = np.array([])
    last = dataset[-1]
    for i in range(X_FUTURE):
        curr_prediction = model.predict(np.array([last]))
        last = np.concatenate([last[1:], curr_prediction])
        transform = np.concatenate([transform, curr_prediction[0]])
        
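    # the scaler was fitted on a single column, so pass the predictions as shape (n, 1)
    transform = scaler.inverse_transform(transform.reshape(-1, 1)).ravel()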

    dicts = []
    curr_date = pd.to_datetime(list1[-1])
    for i in range(X_FUTURE):
        curr_date = curr_date +  relativedelta(month=1)
        dicts.append({'Predictions':transform[i], "Month": curr_date})


    new_data = pd.DataFrame(dicts).set_index("Month")
    ##df_predict = pd.DataFrame(transform, columns=["predicted value"])
          

    response = make_response(new_data.to_csv(index = True, encoding='utf8'))
    response.headers["Content-Disposition"] = "attachment; filename=result.csv"
    return response

if __name__ == "__main__":
    app.run(debug=True, port = 9000, host = "localhost")

This is the result that was written to the new CSV file:

[image]

I think your results are actually correct: the LSTM is trained correctly (though perhaps with low accuracy), and the duplicated predictions are not a mistake but the model's genuine output.
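
One way to see why repeated predictions can be a legitimate outcome: with look_back = 1, each forecast in the loop depends only on the previous forecast, so the loop keeps applying the same function to its own output. Such an iteration can settle on a fixed value. The sketch below uses a hypothetical stand-in function, one_step_model, in place of model.predict; it is an assumption for illustration, not the trained LSTM.

```python
import numpy as np


def one_step_model(x):
    # Hypothetical stand-in for model.predict with look_back = 1:
    # a fixed function of the previous (scaled) value.
    return 0.5 * x + 0.2


def recursive_forecast(last_value, steps):
    # Feed each prediction back in as the next input, as the loop in app.py does.
    preds = []
    for _ in range(steps):
        last_value = one_step_model(last_value)
        preds.append(last_value)
    return np.array(preds)


forecast = recursive_forecast(0.9, 30)
print(np.round(forecast[:5], 4))   # early values still change
print(np.round(forecast[-5:], 4))  # later values have settled near 0.4
```

If the real model behaves similarly, a longer look_back window or a better-trained model may give more varied forecasts, but that depends on the data.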

Regarding the duplicate Month column values: the cause is the relativedelta(month=1) call from the dateutil package. The singular month argument sets the month to January rather than adding one month, so every iteration yields the same date. Instead, try curr_date = curr_date + pd.DateOffset(months=1) (relativedelta(months=1), with the plural keyword, also adds a month); this will produce correct, distinct dates in your Month column.
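
A minimal sketch of the date handling, assuming a hypothetical starting month of 2020-03-01 (the real value comes from the last row of the uploaded CSV). It compares the singular month keyword of relativedelta, which replaces the month, with the plural months keyword and with pd.DateOffset(months=1), which both add one month. The helper next_months is introduced only for this illustration.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta


def next_months(start, step, n):
    # Repeatedly add `step` to the current date, as the loop in app.py does.
    dates, curr = [], start
    for _ in range(n):
        curr = curr + step
        dates.append(curr)
    return dates


def labels(dates):
    return [d.strftime("%Y-%m") for d in dates]


start = pd.Timestamp("2020-03-01")  # hypothetical last month in the CSV

# Singular keyword: the month is set to January on every iteration.
repeated = next_months(start, relativedelta(month=1), 3)
# Plural keyword: one month is added on every iteration.
shifted = next_months(start, relativedelta(months=1), 3)
# pandas offset: also adds one month per iteration.
offset = next_months(start, pd.DateOffset(months=1), 3)

print(labels(repeated))
print(labels(shifted))
print(labels(offset))
```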


