
How to predict an individual value using SKlearn?

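I am very new to machine learning, and I would like to return a percentage for a single array that I pass into the prediction model I have created.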

I'm not sure how to get the match percentage. I thought it was metrics.accuracy_score(Ytest, y_pred), but when I try it I get the following error:
**ValueError: Found input variables with inconsistent numbers of samples: [4, 1]**

I don't know whether this is the correct way to go about it.

import numpy as np                  #linear algebra
import pandas as pd                 # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt     #For Visualisation
import seaborn as sns               #For better Visualisation
from bs4 import BeautifulSoup       #For Text Parsing
import mysql.connector
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.naive_bayes import GaussianNB
import docx2txt
import re
import csv
from sklearn import metrics

class Machine:

    TrainData       = ''


    def __init__(self):


        self.TrainData          = self.GetTrain()

        Data                    = self.ProcessData()

        x                       = Data[0]
        y                       = Data[1]

        x, x_test, y, y_test    = train_test_split(x,y, stratify = y, test_size = 0.25, random_state = 42)

        self.Predict(x,y, '',x_test , y_test )

    def Predict(self,X,Y,Data, Xtext, Ytest):

        model = GaussianNB()
        model.fit(Xtext, Ytest)

        y_pred = model.predict([[1.0, 2.00613, 2, 5]])

        print("Accuracy:",metrics.accuracy_score(Ytest, y_pred))
        


    def ProcessData(self):

            X = []
            Y = []
            i = 0
            for I in self.TrainData:

                Y.append(I[4])
                X.append(I)

                i = i + 1

            i = 0
            for j in X:

                X[i][0] = float(X[i][0])
                X[i][1] = float(X[i][1])
                X[i][2] = int(X[i][2])
                X[i][3] = int(X[i][3])
                del X[i][4]

                i = i + 1

            return X,Y


    def GetTrain(self):
        file        = open('docs/training/TI_Training.csv')
        csvreader   = csv.reader(file)

        header      = []
        header      = next(csvreader)

        rows        = []

        for row in csvreader:
            rows.append(row)

        file.close()

        return rows



Machine()

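The error is pretty clear: Ytest has 4 samples, while y_pred has only one. accuracy_score compares the two arrays element by element, so both need the same number of samples before any metric can be computed. I suspect you instead want to do

y_pred = model.predict(Xtext)

in your Predict function.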
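The following is a minimal sketch of that suggestion, not the asker's actual program. It assumes a synthetic stand-in for TI_Training.csv (four numeric features and a binary label, made up for illustration) and fits on the training split, whereas the question's code fits on the test split. It also shows predict_proba, which may be closer to the "percentage for a single array" the question asks about, since accuracy_score measures agreement over a whole labelled set rather than confidence in one prediction.

```python
import numpy as np
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumption: synthetic data standing in for TI_Training.csv
# (4 numeric features per row, binary class label).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(20, 4)),  # class 0
    rng.normal(3.0, 1.0, size=(20, 4)),  # class 1
])
y = np.array([0] * 20 + [1] * 20)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.25, random_state=42
)

model = GaussianNB()
model.fit(X_train, y_train)  # fit on the training split only

# Accuracy needs one prediction per test label, so predict the whole test set.
y_pred = model.predict(X_test)
accuracy = metrics.accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# For a single sample, predict() gives the class and predict_proba()
# gives the estimated probability of each class in model.classes_.
sample = [[1.0, 2.00613, 2, 5]]
label = model.predict(sample)[0]
proba = model.predict_proba(sample)[0]
print("Predicted class:", label)
print("Class probabilities:", dict(zip(model.classes_.tolist(), proba.tolist())))
```

Multiplying the entry of proba for the predicted class by 100 gives a percentage. Note that Naive Bayes probabilities are often poorly calibrated, so treat that figure as a rough score rather than a precise confidence.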
