
Neural network has <0.001 validation and testing loss but 0% accuracy when doing a prediction

I've been training an MLP to predict the time remaining on an assembly sequence. The training loss, validation loss and MSE are all less than 0.001, yet when I try to do a prediction with one of the datasets I trained the network on, it fails to correctly identify any of the outputs from the input set. What am I doing wrong that causes this?

I'm also struggling to understand how, once the model is deployed, the result of a prediction should be scaled back. scaler.inverse_transform can't be used, because the scaler fitted during training is lost: the prediction is done in a separate script from the training, using the model the training produced. Is this information saved in the model builder?

I've tried changing the batch size during training, rounding the time column of the dataset to the nearest second (it was previously 0.1 s), and training for 50, 100 and 200 epochs, but I always end up with no correct predictions. I'm also training an LSTM to see which approach is more accurate, and it has the same problem. The dataset is split 70-30 into training and testing, and the training portion is then split 75-25 into training and validation.

Data scaling and model training code:

# Imports assumed from context; the original snippet omits them
import numpy as np
import tensorflow as tf
import keras as k
from keras.layers import Dense
from keras import backend as K
from sklearn.preprocessing import MinMaxScaler

def scale_data(training_data, training_data_labels, testing_data, testing_data_labels):
    # Create X and Y scalers between 0 and 1
    x_scaler = MinMaxScaler(feature_range=(0, 1))
    y_scaler = MinMaxScaler(feature_range=(0, 1))

    # Scale training data
    x_scaled_training = x_scaler.fit_transform(training_data)
    y_scaled_training = y_scaler.fit_transform(training_data_labels)

    # Scale testing data
    x_scaled_testing = x_scaler.transform(testing_data)
    y_scaled_testing = y_scaler.transform(testing_data_labels)

    return x_scaled_training, y_scaled_training, x_scaled_testing, y_scaled_testing


def train_model(training_data, training_labels, testing_data, testing_labels, number_of_epochs, number_of_columns):
    model_hidden_neuron_number_list = []
    model_repeat_list = []
    model_error_rate_list = []
    for hidden_layer_1_units in range(int(np.floor(number_of_columns / 2)), int(np.ceil(number_of_columns * 2))):
        print("Training starting, number of hidden units = %d" % hidden_layer_1_units)
        for repeat in range(1, 6):
            print("Repeat %d" % repeat)
            model = k.Sequential()
            model.add(Dense(hidden_layer_1_units, input_dim=number_of_columns,
                        activation='relu', name='hidden_layer_1'))
            model.add(Dense(1, activation='linear', name='output_layer'))
            model.compile(loss='mean_squared_error', optimizer='adam')

            # Train Model
            model.fit(
                training_data,
                training_labels,
                epochs=number_of_epochs,
                shuffle=True,
                verbose=2,
                callbacks=[logger],
                batch_size=1024,
                validation_split=0.25
            )

            # Test Model
            test_error_rate = model.evaluate(testing_data, testing_labels, verbose=0)

            print("Error on testing data is %.3f" % test_error_rate)

            model_hidden_neuron_number_list.append(hidden_layer_1_units)
            model_repeat_list.append(repeat)
            model_error_rate_list.append(test_error_rate)

            # Save Model
            model_builder = tf.saved_model.builder.SavedModelBuilder(
                "MLP/models/{hidden_layer_1_units}/{repeat}".format(
                    hidden_layer_1_units=hidden_layer_1_units, repeat=repeat))

            inputs = {
                'input': tf.saved_model.utils.build_tensor_info(model.input)
            }
            outputs = {
                'time_remaining': tf.saved_model.utils.build_tensor_info(model.output)
            }

            signature_def = tf.saved_model.signature_def_utils.build_signature_def(
                inputs=inputs,
                outputs=outputs,
                method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME
            )

            model_builder.add_meta_graph_and_variables(
                K.get_session(),
                tags=[tf.saved_model.tag_constants.SERVING],
                signature_def_map={
                    tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
                }
            )

        model_builder.save()

Then, to make a prediction:

# Imports assumed from context; the original snippet omits them
import os
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler

# top_level_file_path and file_path are defined elsewhere in the script
file_name = top_level_file_path + "./MLP/models/19/1/"
testing_dataset = pd.read_csv(file_path + os.listdir(file_path)[0])
number_of_rows = len(testing_dataset.index)
number_of_columns = len(testing_dataset.columns)
newcol = [number_of_rows]
max_time = testing_dataset['Time'].max()

for j in range(0, number_of_rows - 1):
    newcol.append(max_time - testing_dataset.iloc[j].iloc[number_of_columns - 1])

x_scaler = MinMaxScaler(feature_range=(0, 1))
y_scaler = MinMaxScaler(feature_range=(0, 1))

# Scale the prediction data (note: these scalers are re-fitted here,
# not the ones that were fitted during training)
data_scaled = x_scaler.fit_transform(testing_dataset)
labels = pd.read_csv("Labels.csv")
labels_scaled = y_scaler.fit_transform(labels)

signature_key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
input_key = 'input'
output_key = 'time_remaining'

with tf.Session(graph=tf.Graph()) as sess:
    saved_model = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], file_name)
    signature = saved_model.signature_def

    x_tensor_name = signature[signature_key].inputs[input_key].name
    y_tensor_name = signature[signature_key].outputs[output_key].name

    x = sess.graph.get_tensor_by_name(x_tensor_name)
    y = sess.graph.get_tensor_by_name(y_tensor_name)

    #np.expand_dims(data_scaled[600], axis=0)
    predictions = sess.run(y, {x: data_scaled})
    predictions = y_scaler.inverse_transform(predictions)
    #print(np.round(predictions, 2))

    correct_result = 0
    for i in range(0, number_of_rows):
        correct_result = 0
        print(np.round(predictions[i]), " ", np.round(newcol[i]))
        if np.round(predictions[i]) == np.round(newcol[i]):
            correct_result += 1
    print((correct_result/number_of_rows)*100)

The output for the first row should be 96.0 but comes out as 110.0, and the last row should be 0.1 but comes out as -40.0, even though no negative numbers appear anywhere in the dataset.

Answer:

You can't calculate accuracy when doing regression; compute the mean squared error on the test set instead.
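
For example, error metrics are the natural replacement here; a short sketch using scikit-learn's metrics (y_true and y_pred are placeholders filled with the example numbers from the question, not variables from the original code):

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([[96.0], [0.1]])    # placeholder ground-truth times (unscaled)
y_pred = np.array([[110.0], [-40.0]]) # placeholder model outputs (unscaled)

print("Test MSE: %.3f" % mean_squared_error(y_true, y_pred))
print("Test MAE: %.3f" % mean_absolute_error(y_true, y_pred))

# If an accuracy-like number is still wanted, count predictions within a
# tolerance of the target instead of testing exact equality:
tolerance = 1.0  # seconds
within = (np.abs(y_pred - y_true) <= tolerance).mean() * 100
print("Predictions within %.1fs: %.1f%%" % (tolerance, within))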

Secondly, when it comes to the scalers, you always run scaler.fit_transform on the training data, so that the scaler computes its parameters (in this case the min and max, since a min-max scaler is used) from the training data only. Then, when performing inference on the test set, you should only run scaler.transform before feeding the data to the model.
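
This also answers the deployment question above: the fitted scaler is not stored inside the TensorFlow SavedModel, so it has to be persisted separately and reloaded in the prediction script. A minimal sketch using joblib (file names and dummy arrays are illustrative, not from the original code):

import numpy as np
import joblib
from sklearn.preprocessing import MinMaxScaler

# Training script: fit the scalers on the training data only, then save them
x_train = np.random.rand(100, 5)   # placeholder training features
y_train = np.random.rand(100, 1)   # placeholder training labels
x_scaler = MinMaxScaler(feature_range=(0, 1)).fit(x_train)
y_scaler = MinMaxScaler(feature_range=(0, 1)).fit(y_train)
joblib.dump(x_scaler, "x_scaler.pkl")
joblib.dump(y_scaler, "y_scaler.pkl")

# Prediction script: reload and transform only -- never re-fit here
x_scaler = joblib.load("x_scaler.pkl")
y_scaler = joblib.load("y_scaler.pkl")
x_new_scaled = x_scaler.transform(np.random.rand(10, 5))  # same min/max as training
y_pred_scaled = np.random.rand(10, 1)  # stands in for the model's scaled output
y_pred = y_scaler.inverse_transform(y_pred_scaled)        # back to original units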
