`ValueError: The least populated class in y has only 1 member, which is too few` in PyCaret
I have a problem working with PyCaret. Previously I did not have any problems.
It started when I oversampled the data (using pandas and this question) and saved it.
Then I read the file in a separate notebook.
import pandas as pd
import pycaret
from pycaret.utils import version
from pycaret.regression import *
from pycaret.classification import *
# Read clean data
starbucks_days = pd.read_csv('days_smote.csv')
# Drop a column
starbucks_days = starbucks_days.drop(['Unnamed: 0'], axis = 1)
starbucks_days = starbucks_days.drop(['transaction', 'offer_viewed', 'offer_received', 'offer_completed'], axis = 1)
starbucks_days = starbucks_days.drop(['label'], axis = 1)
Then I start to use PyCaret:
# Initialize Setup
starbucks_days1 = setup(starbucks_days, target = 'time_completed_viewed', session_id = 123, log_experiment = True, experiment_name = 'days1')
But I get an error:
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
This GitHub issue gives some hints.
I checked some parameters:
type(starbucks_days)
pandas.core.frame.DataFrame
starbucks_days['time_completed_viewed'].value_counts()
6.000000 1682
12.000000 1503
18.000000 1318
24.000000 1212
174.000000 1068
...
444.107530 1
226.213225 1
411.947513 1
236.001744 1
394.722944 1
Name: time_completed_viewed, Length: 3572, dtype: int64
Any tips on what I am missing? As I said, PyCaret works just fine with simple CSV files that were not oversampled.
In your imports, you imported `classification` after `regression`. Both are star imports, so the names from `pycaret.classification` (including `setup`) have overwritten the ones from `pycaret.regression` in the environment.
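As a minimal sketch of why the import order matters, the snippet below builds two stand-in modules that both export a function named `setup` and star-imports them one after the other. The module names `fake_regression` and `fake_classification` are hypothetical and only imitate the situation; they are not part of PyCaret.

```python
import sys
import types

# Two stand-in modules (hypothetical, NOT PyCaret) that both define `setup`.
fake_regression = types.ModuleType("fake_regression")
fake_regression.setup = lambda: "regression setup"

fake_classification = types.ModuleType("fake_classification")
fake_classification.setup = lambda: "classification setup"

# Register them so that the import statements below can find them.
sys.modules["fake_regression"] = fake_regression
sys.modules["fake_classification"] = fake_classification

# Run the star imports in a scratch namespace, in the same order as in the question.
namespace = {}
exec("from fake_regression import *", namespace)
first = namespace["setup"]()

exec("from fake_classification import *", namespace)
second = namespace["setup"]()  # the later import has rebound the name `setup`

print(first)
print(second)
```

After the second star import, the name `setup` refers to the function from the module imported last, which is the situation described above.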
This seems like a regression problem (the target is a continuous value). You don't need to import `classification`.
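One rough way to see why a classification setup fails on this target: according to the error message, every class needs at least two members, while the `value_counts()` output in the question shows many values that occur exactly once. The sketch below counts such values on a small hypothetical sample; `singleton_values` and `sample_target` are illustrative names, and the sample is not the real data.

```python
from collections import Counter


def singleton_values(values):
    """Return, in sorted order, the values that occur exactly once."""
    counts = Counter(values)
    return sorted(v for v, n in counts.items() if n == 1)


# Hypothetical sample (not the real data): a few repeated values plus three of
# the fractional values that appear in the value_counts() output.
sample_target = [6.0] * 4 + [12.0] * 3 + [444.107530, 226.213225, 411.947513]

singletons = singleton_values(sample_target)
print(f"{len(singletons)} value(s) occur only once: {singletons}")
```

If the target is treated as a class label, each of these single-occurrence values becomes a class with one member. With the real DataFrame, the same check could presumably be done on `starbucks_days['time_completed_viewed']`.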
Get rid of this line from your code and it should work fine:
from pycaret.classification import *