
Sklearn LogisticRegression solver needs 2 classes of data

I'm trying to run a Logistic Regression via sklearn:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
import datetime as dt
import pandas as pd
import numpy as np
import talib
import matplotlib.pyplot as plt
import seaborn as sns

col_names = ['dates','prices']
# load dataset
df = pd.read_csv("DJI2.csv", header=None, names=col_names)

df.drop('dates', axis=1, inplace=True)
print(df.shape)
df['3day MA'] = df['prices'].shift(1).rolling(window = 3).mean()
df['10day MA'] = df['prices'].shift(1).rolling(window = 10).mean()
df['30day MA'] = df['prices'].shift(1).rolling(window = 30).mean()
df['Std_dev']= df['prices'].rolling(5).std()
df['RSI'] = talib.RSI(df['prices'].values, timeperiod = 9)
df['Price_Rise'] = np.where(df['prices'].shift(-1) > df['prices'], 1, 0)
df = df.dropna()

xCols = ['3day MA', '10day MA', '30day MA', 'Std_dev', 'RSI', 'prices']
X = df[xCols]
X = X.astype('int')
Y = df['Price_Rise']
Y = Y.astype('int')

logreg = LogisticRegression()

for i in range(len(X)):
    # Without this case below I get: ValueError: Found array with 0 sample(s) (shape=(0, 6)) while a minimum of 1 is required.
    if i == 0:
        continue
    logreg.fit(X[:i], Y[:i])

However, when I try to run this code I get the following error:

ValueError: 
This solver needs samples of at least 2 classes in the data, but the data contains only one class: 58

The shape of my X data is (27779, 6) and the shape of my Y data is (27779,).

Here is a df.head(3) example to show what my data looks like:

     prices    3day MA  10day MA   30day MA   Std_dev        RSI  Price_Rise
30   58.11  57.973333    57.277  55.602333  0.247123  81.932338           1
31   58.42  58.043333    57.480  55.718667  0.213542  84.279674           1
32   58.51  58.216667    57.667  55.774000  0.249139  84.919586           0

I've tried searching for the cause of this issue myself, but I've only managed to find these two answers, both of which discuss the issue as a bug in sklearn. However, they are both approximately two years old, so I do not think that I am having the same issue.

You should make sure you have two unique values in Y[:i]. So before your loop, add something like:

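starting_i = 0
for i in range(len(X)):
    # np.unique returns the distinct labels, so compare their count to 2
    if len(np.unique(Y[:i])) == 2:
        starting_i = i
        break  # stop at the first prefix that contains both classes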

Then just check that starting_i isn't 0 before running your main loop. Or, even simpler, you can find the first index i where Y[i] != Y[0].
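A minimal sketch of that simpler approach, in case it helps. The helper name first_two_class_index is hypothetical (it is not part of sklearn, NumPy, or pandas). It copies the labels into a plain list first, on the assumption that Y may be a pandas Series whose index does not start at 0 (as in the df.head(3) output in the question), where Y[0] would be a label lookup and not a positional one.

```python
def first_two_class_index(y):
    """Return the smallest i such that y[:i] holds two distinct labels.

    Returns None when y is empty or contains only one class.
    """
    # Iterate by position; works for lists, NumPy arrays and pandas Series.
    labels = list(y)
    for k, label in enumerate(labels):
        if label != labels[0]:
            # Position k is the first differing label, so the shortest
            # prefix that includes it is y[:k + 1].
            return k + 1
    return None
```

For the three Price_Rise values shown in the question (1, 1, 0), this returns 3, so the main loop could start at range(3, len(X)).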

if i in range(0, 3):
    continue

This fixed the issue: Y[:i] did not contain two distinct classes before i = 3.
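The hard-coded 3 only fits this particular dataset. A rough sketch of the same loop that works out the starting point from the labels is below. The function name fit_on_growing_prefixes is hypothetical, and model is assumed to be any object with a fit(X, y) method, such as the LogisticRegression instance from the question. Unlike the original loop, this version also fits on the full data as the last step, which is a choice made here and not something stated in the thread.

```python
def fit_on_growing_prefixes(model, X, y):
    """Refit model on X[:i], y[:i] for each prefix with at least two classes.

    Returns the number of fits performed.
    """
    labels = list(y)  # positional view of the labels
    # First prefix length whose labels contain two distinct classes.
    start = next(
        (k + 1 for k, v in enumerate(labels) if v != labels[0]),
        None,
    )
    if start is None:
        raise ValueError("y contains only one class; nothing to fit")

    n_fits = 0
    # range(..., len + 1) so that the last fit uses every row.
    for i in range(start, len(labels) + 1):
        model.fit(X[:i], y[:i])
        n_fits += 1
    return n_fits
```

With the data from the question this would be called as fit_on_growing_prefixes(logreg, X, Y). Note that each iteration refits from scratch, so on 27779 rows the loop is slow.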
