Why does my logistic regression model reach 100% accuracy?

Question

Import the libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sklearn 
from sklearn import preprocessing
import seaborn as sns
%matplotlib inline

Reading the data

df = pd.read_csv('./EngineeredData_2.csv')
df = df.dropna()

Split the data into X and y:

X = df.drop(['Week', 'Div', 'Date', 'HomeTeam', 'AwayTeam', 'HTHG', 'HTAG', 'HTR',
             'FTAG', 'FTHG', 'HGKPP', 'AGKPP', 'FTR'], axis=1)

Transforming y into integers:

L = preprocessing.LabelEncoder()
matchresults = L.fit_transform(list(df['FTR']))
y = list(matchresults)
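For reference, here is a small sketch of what `LabelEncoder` does to the result labels. The values 'H', 'D' and 'A' are only an assumption about what `FTR` contains, since the question does not show the data. The encoder assigns integers following the sorted order of the labels.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical FTR values; the real column contents are not shown in the question.
ftr = ['H', 'D', 'A', 'H', 'A']

encoder = LabelEncoder()
encoded = encoder.fit_transform(ftr)

# Classes are stored in sorted order, so 'A' -> 0, 'D' -> 1, 'H' -> 2.
label_mapping = {label: int(code) for code, label in enumerate(encoder.classes_)}
print(label_mapping)
print(list(encoded))
```

Keeping the fitted encoder around lets you map predictions back to the original labels with `encoder.inverse_transform`.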

Split the data into train and test:

from sklearn.model_selection import train_test_split
X_tng, X_tst, y_tng, y_tst = train_test_split(X, y, test_size=50, shuffle=False)
X_tng.head()
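Note that an integer `test_size` is an absolute number of rows, not a percentage, and `shuffle=False` keeps the original row order, so the test set is simply the last 50 rows. A minimal sketch on toy arrays (the data here is made up purely for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for the real features and labels (an assumption).
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.arange(100)

# An integer test_size is an absolute row count; shuffle=False keeps row order.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=50, shuffle=False
)

print(len(X_train), len(X_test))
print(y_test[0], y_test[-1])
```

With only 50 test rows, a single accuracy figure is a fairly coarse estimate, which is worth keeping in mind when reading the score.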

Import the class

from sklearn.linear_model import LogisticRegression

Instantiate the model

logreg = LogisticRegression()

Fit the model with the data

logreg.fit(X_tng, y_tng)

Predict the test data

y_pred = logreg.predict(X_tst)
acc = logreg.score(X_tst, y_tst)
print(acc)

Does it make sense for the accuracy to be 100%?

Answer 1

The problem is that you unintentionally dropped all of your features and only retained your target value in X. So you are attempting to explain the target value with the target value itself, which of course will give you 100% accuracy. You defined your feature columns as:

X = df.drop(['Week', 'Div', 'Date', 'HomeTeam', 'AwayTeam', 'HTHG', 'HTAG', 'HTR',
             'FTAG', 'FTHG', 'HGKPP', 'AGKPP', 'FTR'], axis=1)

But you should have defined them as:

X = df.drop('FTR', axis=1)
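Whichever definition of X you use, it can help to check what actually ended up in it before trusting the score. The sketch below is only an illustration: `find_leaky_columns` is a hypothetical helper (not part of pandas or scikit-learn), and the toy columns `goal_diff` and `shots` are invented, since the real contents of `EngineeredData_2.csv` are not shown. It confirms that the target is absent from the features and flags any feature whose values each map to a single target value.

```python
import pandas as pd

def find_leaky_columns(features: pd.DataFrame, target: pd.Series) -> list:
    """Return feature columns whose values map to exactly one target value each."""
    leaky = []
    for column in features.columns:
        # Count distinct target values seen for each distinct feature value.
        targets_per_value = target.groupby(features[column]).nunique()
        if (targets_per_value == 1).all():
            leaky.append(column)
    return leaky

# Toy frame (hypothetical values): 'goal_diff' fully determines the result.
toy = pd.DataFrame({
    'goal_diff': [1, 0, -2, 3, 0, -1, 1, -2],
    'shots':     [5, 5, 7, 7, 9, 9, 5, 9],
    'FTR':       ['H', 'D', 'A', 'H', 'D', 'A', 'H', 'A'],
})
X_toy = toy.drop('FTR', axis=1)
y_toy = toy['FTR']

print('FTR' in X_toy.columns)
print(find_leaky_columns(X_toy, y_toy))
```

This check is crude: a column with nearly unique values (an ID, for example) would also be flagged, so treat the result as a prompt to look closer rather than as proof of leakage.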

Answer 1
0 2019-12-06 08:55:00
