Why does the algorithm not accept float numbers?

Task: Two teachers each grade a student's work, and a final grade is derived from the two. If both teachers give the same grade, the final grade equals that grade. If their grades differ, the final grade is -1. I want to train a model to learn this logic.

Data :

  1. Rate1 - First teacher assessment
  2. Rate2 - Second teacher assessment
  3. Result - Final evaluation

Example :

0,1; 0,1 => 0,1

0,7; 0,7 => 0,7

0,3; 0,2 => -1

My code:

import pandas
MyData = pandas.read_excel("train.xlsx")
input_data = MyData.drop("Result", axis=1)
target = MyData.Result
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
model.fit(input_data, target)

After that, I get the following error. If all my grades are integers, the error does not occur, but I have to work with fractions.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-51-433a918946a9> in <module>
----> 1 model.fit(input_data, target)

~\anaconda3\lib\site-packages\sklearn\ensemble\_forest.py in fit(self, X, y, sample_weight)
    319         self.n_outputs_ = y.shape[1]
    320 
--> 321         y, expanded_class_weight = self._validate_y_class_weight(y)
    322 
    323         if getattr(y, "dtype", None) != DOUBLE or not y.flags.contiguous:

~\anaconda3\lib\site-packages\sklearn\ensemble\_forest.py in _validate_y_class_weight(self, y)
    539 
    540     def _validate_y_class_weight(self, y):
--> 541         check_classification_targets(y)
    542 
    543         y = np.copy(y)

~\anaconda3\lib\site-packages\sklearn\utils\multiclass.py in check_classification_targets(y)
    167     if y_type not in ['binary', 'multiclass', 'multiclass-multioutput',
    168                       'multilabel-indicator', 'multilabel-sequences']:
--> 169         raise ValueError("Unknown label type: %r" % y_type)
    170 
    171 

ValueError: Unknown label type: 'continuous'

P.S. My data is here.

How can I work with fractional numbers?

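You are passing float values as the target of a classifier. A classifier treats the target as a set of discrete class labels, and scikit-learn detects fractional float targets as 'continuous', which is what the error message reports. Try a regressor such as DecisionTreeRegressor instead, which accepts float targets.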
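To see why integer grades work but fractional ones do not, here is a small sketch that calls `type_of_target`, the scikit-learn helper behind the check in the traceback. The two sample arrays are made up for illustration and are not taken from the asker's file.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical targets, made up for illustration
float_grades = np.array([0.1, 0.7, -1.0])  # fractional values
int_grades = np.array([1, 7, -1])          # integer values

# Fractional floats are treated as a regression target
float_kind = type_of_target(float_grades)
# Integers are treated as discrete class labels
int_kind = type_of_target(int_grades)

print(float_kind)  # continuous
print(int_kind)    # multiclass
```

A classifier accepts only the label-like types, so the fractional target is rejected before any training happens.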

Here is an example using DecisionTreeRegressor:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumes the columns are ordered Rate1, Rate2, Result
df = pd.read_csv(r'train.csv')

X = df.iloc[:, :2].values  # features: Rate1 and Rate2
y = df.iloc[:, 2].values   # target: Result

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

DTR = DecisionTreeRegressor()
DTR.fit(X_train, y_train)
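If you would rather keep RandomForestClassifier, one possible workaround is to turn each distinct grade into a string label, so the target is discrete again. This is only a sketch under assumptions: the toy DataFrame below is invented for illustration (it is not the asker's data), and whether treating each grade as its own class suits the task is a judgment call.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data following the rule from the question
df = pd.DataFrame({
    "Rate1": [0.1, 0.7, 0.3, 0.5, 0.2, 0.9],
    "Rate2": [0.1, 0.7, 0.2, 0.5, 0.4, 0.9],
})
# Final grade: the shared grade if both agree, otherwise -1
df["Result"] = df["Rate1"].where(df["Rate1"] == df["Rate2"], -1.0)

X = df[["Rate1", "Rate2"]]
# Convert each float grade to a string so it is a class label
y = [str(v) for v in df["Result"]]

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)  # no "Unknown label type" error with string labels

classes = list(clf.classes_)
predicted = clf.predict(X)
print(classes)
```

Note that a classifier trained this way can only predict grades it has seen as labels during training, while a regressor returns numeric values.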
