
Mapping - Feature Importance vs Label classification

I have a dataset (200 rows) on vanilla pound cake baking with 27 features, listed below. The label caketaste measures how good the baked cake was: bad (0), neutral (1), or good (2).

Features = cake_id, flour_g, butter_g, sugar_g, salt_g, eggs_count, bakingpowder_g, milk_ml, water_ml, vanillaextract_ml, lemonzest_g, mixingtime_min, bakingtime_min, preheattime_min, coolingtime_min, bakingtemp_c, preheattemp_c, color_red, color_green, color_blue, traysize_small, traysize_medium, traysize_large, milktype_lowfat, milktype_skim, milktype_whole, trayshape.

Label = caketaste ["bad", "neutral", "good"]

My task is to find:
a) the 5 most important features that affect the label's outcome;
b) the values of those 5 most important features that contributed to a "good" classification in the label.

I was able to solve task a) using sklearn (Python): I fitted the data with RandomForestClassifier() and then identified the 5 most important features from its feature_importances_ attribute. They are mixingtime_min, bakingtime_min, sugar_g, flour_g and preheattemp_c.

Minimal, Complete, and Verifiable Example:

#################################################################
# a) Libraries
#################################################################

import pandas as pd 
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.inspection import permutation_importance
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import MaxAbsScaler
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import accuracy_score
import time

#################################################################
# b) Data Loading
#################################################################

df = pd.read_excel("poundcake.xlsx", sheet_name="Sheet0", engine='openpyxl')

#################################################################
# c) Analyzing Dataframe
#################################################################

#Getting dataframe details e.g columns, total entries, data types etc
print("\n<syntax>: df.info()")
df.info()

#Getting the 1st 5 lines in the dataframe
print("\n<syntax>: df.head()")
print(df.head())  # a bare df.head() shows nothing when run as a script

#################################################################
# d) Data Visualization
#################################################################

#Scatterplot cake_id vs caketaste
fig=plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.scatter(df["cake_id"], df["caketaste"], color='r')
ax.set_xlabel('cake_id')
ax.set_ylabel('caketaste')
ax.set_title('scatter plot')
plt.show()

#################################################################
# e) Feature selection
#################################################################

#Note: 
#scikit-learn estimators cannot work directly with categorical (string) data.
#Categorical variables need to be converted into numeric types before building a machine learning model.

categorical_columns = ["trayshape"]
numerical_columns = ["flour_g","butter_g","sugar_g","salt_g","eggs_count","bakingpowder_g","milk_ml","water_ml","vanillaextract_ml","lemonzest_g","mixingtime_min","bakingtime_min","preheattime_min","coolingtime_min","bakingtemp_c","preheattemp_c","color_red","color_green","color_blue","traysize_small","traysize_medium","traysize_large","milktype_lowfat","milktype_skim","milktype_whole"]

#################################################################
# f) Dataset (Train Test Split)
#
#                         (Dataset)
# ┌──────────────────────────────────────────┐  
#  ┌──────────────────────────┬────────────┐ 
#  |          Training        │ Test       │ 
#  └──────────────────────────┴────────────┘ 
#################################################################

# Features - Training data
X = df[categorical_columns + numerical_columns]

# Prediction target - Training data
y = df["caketaste"] 

# Break off validation set from training data. Default: train_size=0.75, test_size=0.25
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8, test_size=0.2, random_state=42)

#################################################################
# Pipeline
#################################################################

#######################
# g) Column Transformer
#######################
categorical_encoder = OneHotEncoder(handle_unknown='ignore')

#Mean might not be suitable, Remove rows?
numerical_pipe = Pipeline([
    ('imp', SimpleImputer(strategy='mean'))
])

preprocessing = ColumnTransformer(
    [('cat', categorical_encoder, categorical_columns),
     ('num', numerical_pipe, numerical_columns)])

#####################
# b) Pipeline Printer
#####################
#RF: builds multiple decision trees and merges (bagging) them together 
#to get a more accurate and stable prediction (averaging).

pipe_xxx_xxx_rfo = Pipeline([
    ('pre', preprocessing),
    ('scl', None),
    ('pca', None),
    ('clf', RandomForestClassifier(random_state=42))
    ])

pipe_abs_xxx_rfo = Pipeline([
    ('pre', preprocessing),    
    ('scl', MaxAbsScaler()),
    ('pca', None),
    ('clf', RandomForestClassifier(random_state=42))
    ])

#################################################################
# h) Hyper-Parameter Tuning
#################################################################
parameters_rfo = {
        'clf__n_estimators':[100], 
        'clf__criterion':['gini'], 
        'clf__min_samples_split':[2,5], 
        'clf__min_samples_leaf':[1,2]
    }

parameters_rfo_bk = {
        'clf__n_estimators':[10,20,30,40,50,60,70,80,90,100,1000], 
        'clf__criterion':['gini','entropy'], 
        'clf__min_samples_split':[5,10,15,20,25,30], 
        'clf__min_samples_leaf':[1,2,3,4,5]
    }

#########################
# i) GridSearch Printer
#########################    

# scoring can be used as 'accuracy' or for MAE use 'neg_mean_absolute_error'  
scr='accuracy'

grid_xxx_xxx_rfo = GridSearchCV(pipe_xxx_xxx_rfo,
    param_grid=parameters_rfo,
    scoring=scr,
    cv=5,
    refit=True) 

grid_abs_xxx_rfo = GridSearchCV(pipe_abs_xxx_rfo,
    param_grid=parameters_rfo,
    scoring=scr,
    cv=5,
    refit=True)

print("Pipeline setup.... Complete")

###################################################
# Machine Learning Models Evaluation Algorithm
################################################### 
grids = [grid_xxx_xxx_rfo, grid_abs_xxx_rfo]   

grid_dict = {    0: 'RandomForestClassifier', 
                 1: 'RandomForestClassifier with MaxAbsScaler',   
        }

# Fit the grid search objects
print('Performing model optimizations...\n')
best_test_scr = float('-inf')  # None cannot be compared with numbers in Python 3
best_train_scr = None
best_clf = 0
best_rf = None

for idx, grid in enumerate(grids):
    start_time = time.time()

    print('*' * 100)
    print('\nEstimator: %s' % grid_dict[idx])   
    # Fit grid search   
    grid.fit(X_train, y_train)
    
    #Calculate the score once and use when needed
    test_scr = grid.score(X_test,y_test)
    train_scr = grid.score(X_train,y_train)
    
    # Track best (highest test accuracy) model
    if test_scr > best_test_scr:
        best_test_scr = test_scr
        best_train_scr = train_scr
        best_rf = grid
        best_clf = idx
        print("..........................this model is better. SELECTED")
    
    print("Best params                          : %s" % grid.best_params_)
    # Report this estimator's own scores, not the best scores seen so far
    print("Training accuracy                    : %s" % train_scr)
    print("Test accuracy                        : %s" % test_scr)
    print("Modeling time                        : %s" % time.strftime("%H:%M:%S", time.gmtime(time.time() - start_time)))

print('\nClassifier with best test set score: %s' % grid_dict[best_clf])  

#########################################################################################
# j) Feature Importance using Gini Importance or Mean Decrease in Impurity (MDI)
# Note:
# 1. Calculates each feature's importance as the sum over the number of splits (across
# all trees) that include the feature, proportionally to the number of samples it splits.
# 2. Biased towards high-cardinality features, e.g. numerical variables
########################################################################################

ohe = (best_rf.best_estimator_.named_steps['pre'].named_transformers_['cat'])
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
feature_names = ohe.get_feature_names_out(categorical_columns)
feature_names = np.r_[feature_names, numerical_columns]

tree_feature_importances = (best_rf.best_estimator_.named_steps['clf'].feature_importances_)
sorted_idx = tree_feature_importances.argsort()

# Figure: Top Features
count=-28
y_ticks = np.arange(0, abs(count))         
fig, ax = plt.subplots()
ax.barh(y_ticks[count:], tree_feature_importances[sorted_idx][count:])
# Set the tick positions first, then label them
ax.set_yticks(y_ticks[count:])
ax.set_yticklabels(feature_names[sorted_idx][count:], fontsize=7)
ax.set_title("Random Forest Tree's Feature Importance from Mean Decrease in Impurity (MDI)")
fig.tight_layout()
plt.show()

[Figure: feature importance]
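Since the note in section j) says MDI is biased towards high-cardinality features, and permutation_importance is imported above but never used, a cross-check with permutation importance may be worthwhile. The sketch below is only an illustration: it uses a small synthetic stand-in dataset (not the poundcake data) in which the label depends on mixingtime_min alone, and the names X_demo, y_demo and perm_ranking are made up for this example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: NOT the real poundcake.xlsx).
rng = np.random.default_rng(42)
n = 300
X_demo = pd.DataFrame({
    "mixingtime_min": rng.integers(5, 13, n),
    "bakingtime_min": rng.integers(15, 71, n),
    "sugar_g": rng.integers(200, 651, n),
})
# Toy rule: the label depends only on mixingtime_min.
y_demo = (X_demo["mixingtime_min"] <= 8).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_tr, y_tr)

# Shuffle one column at a time on held-out data and measure the accuracy drop.
result = permutation_importance(
    model, X_te, y_te, n_repeats=10, random_state=42, scoring="accuracy")
perm_ranking = pd.Series(
    result.importances_mean, index=X_demo.columns).sort_values(ascending=False)
print(perm_ranking)
```

With the pipeline above, the same call should work on the fitted search result, e.g. permutation_importance(best_rf.best_estimator_, X_test, y_test, ...), in which case the importances refer to the original input columns rather than the one-hot encoded ones.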

What approach can one use to solve task b)? I am trying to answer the research question below:

What are the values of mixingtime_min, bakingtime_min, flour_g, sugar_g and preheattemp_c that statistically contributed to a good caketaste (good: 2)?

Possible Expected Result:

mixingtime_min = [5,10,15] AND
bakingtime_min = [50,51,52,53,54,55] AND
flour_g = [150,160,170,180] AND
sugar_g = [200, 250] AND
preheattemp_c = [150,160,170]

The above result basically concludes that if a person would like a GOOD-tasting cake, they need to bake it using 150-180 g flour with 200-250 g sugar, mix the dough for 5-15 min, and then bake it for 50-55 min in an oven preheated to 150-170 ºC.
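A first, purely descriptive pass at a result of this shape could be to filter the "good" cakes and summarise the range of each of the five features. The sketch below uses only the first 13 rows of the data table (cake_id 0 to 12) to stay short, so its numbers are illustrative; the names sample, summary and by_class are made up for this example. Such ranges describe the good cakes but do not by themselves show that the values caused the good taste, which is why the per-class medians are printed alongside for comparison.

```python
import pandas as pd

top5 = ["mixingtime_min", "bakingtime_min", "flour_g", "sugar_g", "preheattemp_c"]

# First 13 rows of the poundcake table, restricted to the top-5 features.
rows = [
    # mixing, baking, flour, sugar, preheattemp, caketaste
    (10, 30, 180, 250, 175, 1),
    (10, 40, 195, 500, 170, 1),
    (5, 30, 160, 600, 160, 2),
    (7, 40, 200, 350, 165, 0),
    (9, 60, 175, 400, 155, 0),
    (7, 15, 180, 650, 160, 2),
    (7, 60, 165, 200, 170, 0),
    (8, 70, 170, 200, 150, 1),
    (9, 35, 160, 300, 170, 1),
    (11, 30, 165, 350, 170, 1),
    (8, 30, 180, 650, 170, 1),
    (5, 50, 165, 350, 170, 1),
    (7, 60, 175, 500, 170, 2),
]
sample = pd.DataFrame(rows, columns=top5 + ["caketaste"])

# Range of each feature among the cakes labelled good (2).
good = sample[sample["caketaste"] == 2]
summary = good[top5].agg(["min", "median", "max"]).T
print(summary)

# Per-class medians, to see whether "good" actually differs from the rest.
by_class = sample.groupby("caketaste")[top5].median()
print(by_class)
```

On the full dataset, sample would simply be replaced by df.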

Hope you can give some pointers.

Question

Would you be able to guide me on how to go about approaching this research question?
Is there any library in sklearn or otherwise that is able to get this information?
Any additional information such as confidence interval, outliers etc. is a bonus.
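On the confidence-interval point, one simple option might be a percentile bootstrap on a feature's mean among the good cakes. The sketch below is an assumption about what such an interval could look like, not an established answer: it takes mixingtime_min for the first ten cakes labelled good (2) in the table (cake_id 2, 5, 12, 16, 23, 25, 26, 27, 31, 32), and the names good_mixing, boot_means, ci_low and ci_high are made up for this example.

```python
import numpy as np

# mixingtime_min of the first ten "good" cakes in the data table.
good_mixing = np.array([5, 7, 7, 5, 5, 6, 10, 9, 8, 7])

rng = np.random.default_rng(0)
n_boot = 5000

# Resample with replacement and record the mean of each resample.
boot_means = np.array([
    rng.choice(good_mixing, size=good_mixing.size, replace=True).mean()
    for _ in range(n_boot)
])

# 95% percentile bootstrap interval for the mean mixing time.
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print("mean = %.2f, 95%% CI = [%.2f, %.2f]" % (good_mixing.mean(), ci_low, ci_high))
```

The interval is for the average mixing time of good cakes, which is a narrower statement than "values that produce a good cake".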

The data (poundcake.xlsx):

cake_id flour_g butter_g    sugar_g salt_g  eggs_count  bakingpowder_g  milk_ml water_ml    vanillaextract_ml   lemonzest_g mixingtime_min  bakingtime_min  preheattime_min coolingtime_min bakingtemp_c    preheattemp_c   color_red   color_green color_blue  traysize_small  traysize_medium traysize_large  milktype_lowfat milktype_skim   milktype_whole  trayshape   caketaste
0   180 50  250 2   3   3   15  80  1   2   10  30  25  15  170 175 1   0   0   1   0   0   1   0   0   square  1
1   195 50  500 6   6   1   30  60  1   2   10  40  30  10  170 170 0   1   0   1   0   0   0   1   0   rectangle   1
2   160 40  600 6   5   1   15  90  3   3   5   30  30  10  155 160 1   0   0   1   0   0   0   0   1   square  2
3   200 80  350 8   4   2   15  50  1   1   7   40  20  10  175 165 0   1   0   1   0   0   0   0   1   rectangle   0
4   175 90  400 6   6   4   25  90  1   1   9   60  25  15  160 155 1   0   0   0   0   1   0   1   0   rectangle   0
5   180 60  650 6   3   4   20  80  2   3   7   15  20  20  155 160 0   0   1   0   0   1   0   1   0   rectangle   2
6   165 50  200 6   4   2   20  80  1   2   7   60  30  20  150 170 0   1   0   1   0   0   1   0   0   rectangle   0
7   170 70  200 6   2   3   25  50  2   3   8   70  20  10  170 150 0   1   0   1   0   0   0   1   0   rectangle   1
8   160 90  300 8   4   4   25  60  3   2   9   35  30  15  175 170 0   1   0   1   0   0   1   0   0   square  1
9   165 50  350 6   4   1   25  80  1   2   11  30  10  10  170 170 1   0   0   0   1   0   1   0   0   square  1
10  180 90  650 4   3   4   20  50  2   3   8   30  30  15  165 170 1   0   0   1   0   0   0   1   0   square  1
11  165 40  350 6   2   2   30  60  3   3   5   50  25  15  175 170 0   0   1   1   0   0   0   0   1   rectangle   1
12  175 70  500 6   2   1   25  80  1   1   7   60  20  15  170 170 0   1   0   1   0   0   1   0   0   square  2
13  175 70  350 6   2   1   15  60  2   3   9   45  30  15  175 170 0   0   0   1   0   0   0   1   0   rectangle   1
14  160 70  600 4   6   4   30  60  2   3   5   60  25  10  150 155 0   1   0   1   0   0   0   1   0   rectangle   0
15  165 50  500 2   3   4   20  60  1   3   10  30  15  20  175 175 0   1   0   1   0   0   1   0   0   rectangle   0
16  195 50  600 6   5   2   25  60  1   1   5   30  10  20  170 150 0   0   0   1   0   0   0   0   1   square  2
17  160 60  600 8   5   4   25  70  3   3   9   30  30  10  175 150 0   0   0   1   0   0   1   0   0   rectangle   0
18  160 80  550 6   3   3   23  80  1   1   9   25  30  15  155 170 0   0   1   1   0   0   0   0   1   rectangle   1
19  170 60  600 4   5   1   20  90  3   3   10  55  20  15  165 155 0   0   1   1   0   0   0   0   1   square  0
20  175 70  300 6   5   4   25  70  1   1   11  65  15  20  170 155 0   0   1   1   0   0   0   1   0   round   0
21  195 80  250 6   6   2   23  70  2   3   11  20  30  15  170 155 0   0   1   1   0   0   1   0   0   rectangle   0
22  170 90  650 6   3   4   20  70  1   2   10  60  25  15  170 155 0   0   1   0   0   1   0   1   0   rectangle   1
23  180 40  200 6   3   1   15  60  3   1   5   35  15  15  170 170 0   1   0   1   0   0   0   1   0   rectangle   2
24  165 50  550 8   4   2   23  80  1   2   5   65  30  15  155 175 0   0   0   1   0   0   1   0   0   rectangle   1
25  170 50  250 6   2   3   25  70  2   2   6   30  20  15  165 175 0   0   0   0   0   1   0   1   0   rectangle   2
26  180 50  200 6   4   2   30  80  1   3   10  30  20  15  165 165 0   0   0   1   0   0   0   1   0   rectangle   2
27  200 90  500 6   3   4   25  70  2   1   9   60  30  15  170 160 0   1   0   1   0   0   0   1   0   rectangle   2
28  170 60  300 6   2   3   25  80  1   1   9   15  15  15  160 150 1   0   0   0   0   1   0   0   1   round   1
29  170 60  400 2   3   2   25  60  1   3   9   25  15  15  160 175 0   0   0   1   0   0   1   0   0   square  0
30  195 50  650 4   5   2   25  60  1   3   7   40  15  15  165 170 0   1   0   1   0   0   1   0   0   rectangle   1
31  170 50  350 6   6   1   25  80  2   2   8   50  25  15  150 170 0   1   0   1   0   0   1   0   0   rectangle   2
32  160 80  550 4   4   4   20  70  1   3   7   25  25  15  170 165 1   0   0   0   0   1   0   0   1   rectangle   2
33  170 50  300 4   4   2   23  50  2   2   10  30  20  15  150 170 0   0   0   1   0   0   1   0   0   rectangle   0
34  175 70  650 4   4   1   23  70  3   3   10  55  10  15  150 170 0   0   1   1   0   0   0   0   1   rectangle   0
35  180 70  400 6   2   2   20  60  1   1   6   55  30  15  170 150 0   0   0   1   0   0   1   0   0   square  2
36  195 60  300 6   6   4   23  70  2   2   10  30  30  15  170 175 1   0   0   1   0   0   1   0   0   rectangle   0
37  180 70  400 6   4   1   20  70  3   2   9   30  30  20  160 170 1   0   0   1   0   0   0   1   0   rectangle   2
38  170 90  600 8   3   1   20  50  1   2   9   30  30  15  155 170 1   0   0   1   0   0   0   1   0   rectangle   2
39  180 60  200 2   3   2   20  70  1   2   10  55  30  20  165 155 0   1   0   1   0   0   0   1   0   round   2
40  180 70  400 6   4   2   15  60  1   3   7   45  30  10  170 175 0   0   0   1   0   0   0   1   0   rectangle   2
41  170 70  200 6   3   1   30  60  3   2   6   40  15  15  170 175 0   0   1   1   0   0   0   0   1   rectangle   2
42  170 60  550 6   3   4   20  80  1   2   9   60  20  15  150 165 1   0   0   1   0   0   1   0   0   round   2
43  170 50  600 6   4   3   30  60  1   2   11  15  30  15  155 150 1   0   0   0   1   0   1   0   0   rectangle   0
44  175 70  200 4   4   3   30  70  3   2   6   20  10  20  170 170 0   0   0   1   0   0   1   0   0   rectangle   1
45  195 70  500 8   4   2   25  60  2   3   6   15  30  15  165 170 1   0   0   0   0   1   0   1   0   rectangle   2
46  180 80  200 4   4   4   15  80  1   3   6   50  30  15  155 150 0   0   0   1   0   0   0   1   0   rectangle   2
47  165 50  350 6   4   2   20  60  1   1   9   40  20  15  150 155 0   0   0   1   0   0   1   0   0   rectangle   0
48  170 70  550 2   2   4   20  60  3   2   9   55  30  15  165 165 0   1   0   1   0   0   0   0   1   round   0
49  175 70  350 6   5   4   30  80  1   2   9   55  30  10  155 170 0   0   0   0   0   1   1   0   0   rectangle   1
50  180 50  400 6   4   3   25  50  2   2   9   20  20  20  160 160 0   0   0   1   0   0   0   1   0   rectangle   2
51  165 50  650 6   5   4   20  60  1   2   5   60  30  15  175 170 0   0   1   1   0   0   0   0   1   square  0
52  170 70  200 2   6   3   25  60  1   3   8   35  25  15  170 155 1   0   0   1   0   0   0   0   1   rectangle   1
53  180 40  350 4   4   3   30  60  3   2   12  45  30  15  150 175 0   0   0   1   0   0   0   1   0   rectangle   1
54  175 50  600 8   3   1   20  80  2   1   7   30  15  15  150 160 0   0   0   1   0   0   0   0   1   square  0
55  175 70  400 4   3   1   25  90  1   2   5   50  30  10  170 170 1   0   0   0   0   1   1   0   0   rectangle   1
56  170 50  650 6   6   3   20  70  1   1   6   25  30  15  170 160 1   0   0   1   0   0   0   1   0   rectangle   2
57  200 70  650 6   3   1   15  60  2   1   10  25  10  15  170 150 0   1   0   1   0   0   0   0   1   rectangle   2
58  175 80  650 6   5   2   23  70  1   1   5   45  15  15  160 170 0   1   0   1   0   0   0   0   1   rectangle   1
59  170 50  200 8   3   4   30  70  1   3   11  35  25  15  170 170 0   0   0   1   0   0   0   1   0   rectangle   1
60  170 60  300 6   3   1   20  60  3   3   11  20  30  15  170 170 1   0   0   1   0   0   0   0   1   rectangle   0
61  180 40  350 2   4   3   20  70  3   2   12  20  10  15  150 160 0   0   0   1   0   0   1   0   0   square  2
62  175 60  200 6   6   1   15  80  2   2   12  25  20  15  155 160 1   0   0   1   0   0   0   0   1   rectangle   2
63  170 70  650 6   2   3   23  90  3   3   10  25  30  20  170 155 1   0   0   1   0   0   0   1   0   rectangle   2
64  170 70  600 6   4   2   25  80  2   2   6   50  15  15  170 155 0   0   0   1   0   0   0   1   0   rectangle   0
65  170 60  250 6   2   2   30  60  1   2   9   20  15  10  165 165 0   0   0   1   0   0   0   1   0   rectangle   2
66  175 50  650 4   2   1   23  60  2   2   11  20  30  20  170 175 1   0   0   1   0   0   0   1   0   rectangle   1
67  175 70  350 4   3   3   30  50  1   2   10  35  25  15  175 170 0   0   0   1   0   0   1   0   0   rectangle   0
68  165 90  600 6   5   2   23  60  1   3   9   55  10  15  160 165 0   1   0   1   0   0   1   0   0   square  0
69  200 80  600 6   3   1   30  60  2   1   8   30  30  15  175 165 1   0   0   0   1   0   0   0   1   rectangle   1
70  165 50  200 6   5   2   23  60  2   1   12  55  30  15  170 170 0   0   0   0   0   1   0   0   1   round   0
71  175 60  300 4   6   1   15  60  3   2   12  55  20  15  175 165 0   0   0   1   0   0   0   0   1   square  0
72  175 70  200 8   5   4   20  60  1   3   12  60  25  15  175 170 0   1   0   1   0   0   0   1   0   rectangle   2
73  180 60  200 4   4   4   30  70  1   3   8   35  30  10  175 170 0   0   0   1   0   0   1   0   0   rectangle   2
74  170 80  650 6   3   1   30  60  1   2   5   55  30  20  155 175 1   0   0   1   0   0   0   0   1   rectangle   2
75  170 60  500 8   4   1   23  60  3   2   7   60  30  15  165 170 0   0   0   1   0   0   0   1   0   square  2
76  175 70  250 6   4   2   30  60  1   2   12  65  20  15  170 160 1   0   0   0   0   1   0   0   1   square  2
77  180 50  500 8   5   1   15  70  3   3   8   40  10  15  165 155 0   0   1   0   1   0   0   0   1   rectangle   1
78  175 60  550 6   4   2   20  90  1   2   7   25  30  15  175 165 0   1   0   1   0   0   0   0   1   rectangle   0
79  170 70  600 8   4   4   15  80  3   3   11  60  30  15  175 150 1   0   0   1   0   0   0   0   1   rectangle   1
80  195 60  200 4   5   3   30  60  1   2   8   30  20  15  170 170 0   1   0   1   0   0   0   1   0   square  0
81  180 70  300 6   3   3   20  90  1   3   11  25  20  10  170 150 0   0   0   1   0   0   0   1   0   rectangle   0
82  170 40  550 2   4   3   30  60  1   2   9   35  30  10  170 170 0   0   0   0   0   1   0   1   0   square  1
83  175 60  550 6   5   2   15  90  1   1   11  30  10  15  170 175 1   0   0   1   0   0   0   0   1   rectangle   0
84  180 50  350 4   4   3   23  50  2   2   7   20  30  10  170 175 0   0   0   1   0   0   0   0   1   rectangle   2
85  180 80  600 4   4   1   25  60  1   1   5   55  30  10  170 165 0   0   1   1   0   0   0   0   1   rectangle   1
86  175 50  650 8   2   3   15  50  1   2   10  50  25  15  160 160 0   0   0   1   0   0   0   0   1   square  0
87  175 50  350 2   6   3   23  80  2   2   10  20  25  15  170 155 1   0   0   1   0   0   0   0   1   rectangle   1
88  170 50  350 4   2   4   25  60  2   1   10  20  15  15  150 155 0   1   0   1   0   0   1   0   0   rectangle   0
89  180 50  550 6   5   4   30  90  2   3   7   60  30  15  155 175 0   0   0   1   0   0   0   1   0   rectangle   2
90  170 70  600 6   5   3   15  90  1   2   6   45  10  15  170 170 0   1   0   1   0   0   1   0   0   round   1
91  170 70  300 4   4   2   20  60  1   1   10  15  30  10  165 155 0   0   0   1   0   0   1   0   0   rectangle   1
92  180 50  650 4   2   4   20  80  1   2   8   65  30  15  150 160 0   1   0   1   0   0   0   0   1   rectangle   2
93  170 50  350 6   3   3   30  60  1   3   7   55  30  20  155 170 1   0   0   1   0   0   1   0   0   rectangle   0
94  170 90  400 6   4   1   30  60  3   2   12  70  30  15  170 160 0   0   1   1   0   0   0   1   0   rectangle   1
95  160 70  400 2   6   4   23  70  2   1   9   20  30  10  150 175 0   0   0   1   0   0   0   0   1   square  1
96  170 80  250 4   2   3   30  60  3   1   10  30  30  15  155 165 0   0   0   0   0   1   0   0   1   rectangle   1
97  195 70  250 6   6   4   30  80  3   1   11  20  15  15  170 170 1   0   0   1   0   0   0   0   1   rectangle   2
98  180 50  650 6   6   1   30  90  3   1   7   25  15  15  170 170 1   0   0   1   0   0   0   0   1   rectangle   2
99  195 50  200 6   3   1   23  90  1   1   9   55  25  15  160 170 0   0   0   1   0   0   0   0   1   rectangle   0
100 175 50  200 4   3   3   20  50  2   2   12  15  30  10  170 170 0   0   1   1   0   0   0   1   0   square  1
101 165 70  350 4   4   4   15  90  1   2   12  40  15  15  155 155 0   1   0   1   0   0   0   0   1   rectangle   1
102 180 80  600 4   4   3   25  50  1   2   11  30  10  15  155 170 0   0   1   1   0   0   0   0   1   rectangle   1
103 165 50  300 6   3   1   30  60  1   1   9   40  25  15  160 170 0   0   0   1   0   0   0   1   0   rectangle   1
104 160 50  600 8   2   4   20  60  1   2   12  60  30  15  170 170 0   0   0   1   0   0   1   0   0   square  2
105 170 90  200 2   2   2   15  60  3   2   5   40  20  15  170 160 0   0   0   1   0   0   0   1   0   rectangle   2
106 175 90  600 6   4   2   15  60  1   1   7   20  30  15  175 170 1   0   0   0   0   1   0   1   0   rectangle   2
107 180 70  550 6   3   1   15  90  1   1   9   25  30  15  150 160 1   0   0   1   0   0   0   1   0   rectangle   2
108 170 90  250 8   4   4   30  60  2   3   6   60  25  15  155 155 0   0   0   1   0   0   0   0   1   rectangle   0
109 200 40  500 6   6   2   20  60  3   2   10  50  30  15  170 155 0   0   0   1   0   0   1   0   0   rectangle   0
110 175 70  500 2   3   4   30  60  3   2   5   65  20  15  170 155 1   0   0   1   0   0   0   0   1   rectangle   2
111 165 60  550 6   3   2   30  80  2   1   9   20  25  20  170 175 0   0   0   1   0   0   0   0   1   rectangle   2
112 195 70  350 6   6   2   25  90  2   2   12  50  30  15  150 165 0   0   1   1   0   0   0   1   0   square  2
113 165 90  300 4   3   4   30  60  1   2   9   30  25  15  165 170 0   1   0   0   0   1   0   1   0   rectangle   0
114 195 40  650 6   2   1   23  80  1   2   5   25  25  15  170 165 0   1   0   1   0   0   0   1   0   rectangle   1
115 175 60  200 2   4   3   15  50  3   3   6   25  30  15  155 170 1   0   0   1   0   0   1   0   0   square  0
116 175 70  400 6   4   3   15  60  2   3   11  20  20  15  150 170 1   0   0   0   1   0   0   1   0   rectangle   2
117 195 70  350 6   3   2   30  60  3   2   12  25  25  20  175 175 0   0   0   1   0   0   0   0   1   rectangle   2
118 170 50  500 6   4   3   30  80  2   3   10  60  30  15  170 160 0   1   0   1   0   0   0   0   1   rectangle   0
119 195 60  650 6   4   1   20  70  3   2   5   65  20  20  170 150 0   0   1   0   0   1   0   0   1   rectangle   2
120 170 70  650 8   4   4   25  80  1   2   9   45  30  15  170 170 0   0   1   1   0   0   0   1   0   round   1
121 170 70  650 8   4   2   30  90  1   2   12  30  15  15  170 170 0   0   1   1   0   0   1   0   0   square  0
122 170 60  400 4   6   4   15  60  2   2   11  60  30  15  170 150 0   0   1   1   0   0   1   0   0   square  0
123 175 60  300 8   6   3   20  60  2   2   12  50  25  15  150 175 0   0   1   0   1   0   0   1   0   round   2
124 175 50  400 4   3   1   23  50  3   2   9   50  30  15  150 150 0   0   1   1   0   0   0   1   0   square  0
125 180 40  300 6   4   1   15  50  3   2   10  60  30  15  170 175 0   0   1   0   1   0   0   1   0   rectangle   2
126 195 60  250 6   4   3   25  90  2   2   6   60  30  10  170 175 1   0   0   0   0   1   0   0   1   rectangle   2
127 160 70  300 4   2   1   20  60  2   2   5   40  20  15  160 170 0   0   0   1   0   0   0   1   0   square  2
128 170 60  300 8   6   2   30  80  1   1   10  65  30  15  155 155 0   1   0   1   0   0   0   0   1   square  2
129 160 40  350 6   6   2   15  60  1   1   5   25  30  15  155 170 0   0   1   0   0   1   0   1   0   rectangle   2
130 170 60  500 2   5   3   30  50  3   2   10  60  10  15  165 160 0   0   0   1   0   0   1   0   0   rectangle   1
131 170 60  650 8   3   3   23  90  1   1   10  70  15  15  170 175 1   0   0   1   0   0   1   0   0   rectangle   2
132 170 50  600 4   4   1   20  50  2   2   5   60  25  15  170 160 1   0   0   1   0   0   0   0   1   square  2
133 180 50  350 6   5   2   25  90  3   2   5   20  30  15  175 160 0   0   0   1   0   0   1   0   0   rectangle   0
134 170 90  200 4   2   4   20  90  3   2   10  20  25  15  170 175 0   0   0   1   0   0   0   0   1   rectangle   1
135 200 40  350 6   6   1   30  80  1   1   5   60  25  20  170 175 0   0   1   1   0   0   0   1   0   rectangle   2
136 165 60  250 2   3   2   25  60  1   1   8   20  15  15  170 170 0   1   0   1   0   0   0   0   1   rectangle   0
137 175 70  250 6   6   4   15  60  2   2   11  50  30  15  175 175 0   1   0   0   0   1   0   1   0   rectangle   2
138 180 50  350 6   4   2   25  70  3   2   5   45  25  15  170 170 0   0   0   0   0   1   1   0   0   rectangle   0
139 195 60  600 6   4   2   20  50  1   1   10  35  15  15  165 175 1   0   0   1   0   0   0   1   0   round   2
140 180 60  300 8   4   4   25  80  1   1   5   60  30  15  165 170 0   0   0   1   0   0   0   0   1   rectangle   1
141 200 60  500 8   4   1   23  70  2   2   8   15  30  15  160 170 0   0   0   1   0   0   1   0   0   rectangle   0
142 170 60  550 6   4   4   30  60  2   2   6   65  20  15  175 165 0   1   0   1   0   0   0   0   1   rectangle   1
143 170 40  600 2   2   1   15  70  1   2   11  30  25  20  175 165 0   0   0   1   0   0   0   0   1   rectangle   0
144 175 70  250 6   4   3   30  60  1   2   10  60  30  20  155 175 0   1   0   1   0   0   1   0   0   rectangle   2
145 180 50  250 4   5   3   15  80  1   2   6   60  30  15  170 170 0   0   0   1   0   0   0   0   1   rectangle   2
146 165 50  350 6   4   4   25  80  1   2   12  25  15  15  155 165 1   0   0   1   0   0   0   0   1   rectangle   0
147 170 60  500 6   5   4   23  60  1   2   10  15  30  20  160 170 1   0   0   1   0   0   1   0   0   rectangle   1
148 170 50  400 6   4   3   20  60  2   3   6   35  10  15  170 175 0   0   1   1   0   0   0   0   1   rectangle   1
149 195 80  650 8   4   3   30  90  1   1   6   15  20  10  165 160 1   0   0   0   1   0   1   0   0   rectangle   2
150 165 90  500 8   3   4   20  60  2   2   5   25  30  15  165 170 0   1   0   0   0   1   0   0   1   rectangle   1
151 160 80  200 2   4   4   30  80  3   1   5   50  25  15  170 160 0   1   0   1   0   0   0   1   0   rectangle   0
152 180 50  500 2   6   1   15  60  1   1   8   65  20  15  170 170 1   0   0   0   0   1   1   0   0   rectangle   2
153 165 60  600 6   4   1   30  70  3   3   11  15  30  10  170 170 0   0   0   1   0   0   1   0   0   rectangle   0
154 180 60  600 2   3   2   30  70  1   2   6   55  15  15  150 165 1   0   0   1   0   0   0   0   1   rectangle   2
155 160 60  400 2   6   4   15  60  1   1   9   55  30  10  170 160 1   0   0   1   0   0   1   0   0   rectangle   0
156 180 60  250 4   3   2   25  80  3   1   6   25  25  20  170 160 0   1   0   0   1   0   0   1   0   square  2
157 195 50  200 6   4   3   30  70  3   2   6   35  30  15  165 170 1   0   0   0   0   1   1   0   0   rectangle   2
158 170 50  650 6   5   2   15  60  3   2   12  35  30  10  170 175 1   0   0   0   1   0   0   1   0   rectangle   0
159 160 70  400 6   3   2   20  50  1   2   9   20  30  15  155 155 0   0   1   0   0   1   1   0   0   rectangle   0
160 175 90  600 6   4   4   23  80  3   3   7   20  20  15  155 160 1   0   0   1   0   0   0   1   0   rectangle   0
161 180 50  400 4   4   1   23  70  1   2   12  20  30  20  165 170 0   1   0   1   0   0   0   0   1   rectangle   1
162 170 90  250 6   3   3   20  80  2   2   12  25  15  15  170 155 0   0   0   1   0   0   0   1   0   round   2
163 170 60  200 2   6   1   23  80  3   1   10  30  30  15  170 175 0   1   0   0   0   1   0   1   0   rectangle   2
164 175 50  650 2   5   3   25  70  3   2   11  60  25  15  175 160 0   1   0   1   0   0   0   0   1   rectangle   2
165 195 90  400 6   3   3   23  60  1   2   7   35  25  20  170 155 0   0   0   1   0   0   1   0   0   round   1
166 180 50  600 6   3   4   25  60  2   2   10  20  10  15  155 175 0   1   0   1   0   0   0   1   0   square  0
167 200 50  500 6   3   3   15  90  2   1   6   20  25  10  170 155 0   1   0   1   0   0   0   0   1   rectangle   1
168 200 60  200 6   2   3   20  60  3   3   5   20  10  15  170 170 1   0   0   1   0   0   0   1   0   rectangle   1
169 200 60  300 4   5   3   20  90  3   2   12  30  25  15  155 160 0   0   1   1   0   0   0   0   1   rectangle   0
170 180 70  250 6   4   3   30  50  1   2   12  35  25  10  155 150 0   0   0   1   0   0   1   0   0   rectangle   1
171 175 70  200 4   6   4   30  60  2   2   5   25  30  15  150 160 0   0   1   1   0   0   0   1   0   square  0
172 165 90  400 2   5   1   30  90  3   2   6   70  30  15  170 170 0   1   0   1   0   0   0   0   1   rectangle   2
173 165 70  200 6   6   4   20  70  1   1   5   65  20  20  175 155 0   0   0   1   0   0   0   1   0   round   0
174 180 50  650 2   3   3   20  70  3   2   12  40  30  15  155 170 0   0   0   1   0   0   0   0   1   rectangle   1
175 180 40  200 6   3   2   30  80  3   3   7   60  30  10  175 150 0   1   0   1   0   0   1   0   0   rectangle   2
176 180 60  400 2   5   3   20  50  1   3   5   20  30  15  175 150 0   1   0   1   0   0   0   1   0   rectangle   1
177 200 50  400 4   6   4   23  60  2   2   7   55  20  15  160 170 0   1   0   1   0   0   0   0   1   round   0
178 180 50  550 6   4   3   20  50  2   2   8   20  25  20  170 170 1   0   0   0   0   1   1   0   0   rectangle   0
179 175 70  250 8   4   1   20  50  2   3   6   60  30  15  170 170 0   0   0   0   1   0   0   0   1   square  0
180 195 70  400 6   4   4   23  60  3   1   7   65  25  15  170 150 1   0   0   1   0   0   0   1   0   rectangle   1
181 160 50  500 6   4   3   25  50  1   1   11  55  10  15  170 170 0   0   0   0   0   1   0   0   1   rectangle   1
182 180 90  500 6   3   3   23  60  2   1   8   20  30  15  170 170 0   0   0   0   0   1   0   1   0   rectangle   1
183 170 70  650 2   3   3   25  80  1   3   8   45  20  10  170 170 0   1   0   1   0   0   0   1   0   round   2
184 195 70  600 6   4   2   25  60  1   2   6   40  30  15  155 170 1   0   0   1   0   0   0   0   1   rectangle   1
185 165 70  200 6   4   1   20  60  1   2   8   45  15  15  170 150 0   1   0   1   0   0   0   0   1   round   1
186 165 80  200 4   4   3   30  60  1   1   8   25  30  10  160 170 0   1   0   1   0   0   1   0   0   round   0
187 175 60  600 4   2   3   20  60  1   2   6   25  20  15  170 155 0   0   0   1   0   0   1   0   0   rectangle   2
188 180 70  500 6   4   3   30  70  2   2   7   55  30  15  170 150 1   0   0   1   0   0   1   0   0   square  1
189 180 50  600 2   4   4   30  60  3   1   9   40  25  15  170 170 1   0   0   0   0   1   0   0   1   rectangle   0
190 160 50  600 8   3   2   20  60  3   2   12  30  30  15  165 150 0   0   0   0   1   0   1   0   0   rectangle   2
191 180 60  200 6   2   1   30  60  3   2   7   20  30  15  175 160 1   0   0   1   0   0   1   0   0   rectangle   2
192 195 70  600 6   4   3   23  80  2   2   12  50  25  10  170 170 0   0   0   0   0   1   1   0   0   rectangle   1
193 180 60  250 6   3   1   15  60  2   3   5   60  30  20  175 165 1   0   0   0   0   1   0   1   0   rectangle   1
194 170 70  250 6   4   1   20  90  2   2   10  25  20  20  175 170 0   0   0   1   0   0   0   1   0   round   1
195 180 90  250 6   3   1   25  50  1   2   9   55  30  15  170 175 1   0   0   0   0   1   1   0   0   rectangle   1
196 160 70  550 6   3   4   30  90  3   2   10  60  20  15  165 165 0   1   0   1   0   0   0   1   0   round   0
197 175 60  200 8   2   3   15  60  1   2   11  50  30  15  165 175 0   1   0   1   0   0   0   0   1   rectangle   1
198 170 80  500 6   3   2   25  50  1   1   5   60  20  15  175 150 1   0   0   1   0   0   0   1   0   square  2
199 180 50  600 4   4   4   15  80  1   1   5   50  20  15  170 170 1   0   0   1   0   0   0   1   0   rectangle   2

A very simple solution could be to run a decision tree classifier on your data and visualize the tree using the graphviz library. Here's the documentation: https://scikit-learn.org/stable/modules/generated/sklearn.tree.export_graphviz.html . You could also visualize it in WebGraphviz once you have the dot file generated by the code. The outcome of this exercise can be the value ranges you are expecting.
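A minimal sketch of this idea follows. It swaps export_graphviz for export_text from the same sklearn.tree module, which prints the split thresholds as plain text and needs no Graphviz install. It is fitted on only the first 13 rows of the question's table (cake_id 0 to 12) and the five features named there, which is far too little data to draw conclusions from; max_depth=3 is an arbitrary choice to keep the rules readable.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

top5 = ["mixingtime_min", "bakingtime_min", "flour_g", "sugar_g", "preheattemp_c"]

# First 13 rows of the poundcake table, restricted to the top-5 features.
rows = [
    # mixing, baking, flour, sugar, preheattemp, caketaste
    (10, 30, 180, 250, 175, 1),
    (10, 40, 195, 500, 170, 1),
    (5, 30, 160, 600, 160, 2),
    (7, 40, 200, 350, 165, 0),
    (9, 60, 175, 400, 155, 0),
    (7, 15, 180, 650, 160, 2),
    (7, 60, 165, 200, 170, 0),
    (8, 70, 170, 200, 150, 1),
    (9, 35, 160, 300, 170, 1),
    (11, 30, 165, 350, 170, 1),
    (8, 30, 180, 650, 170, 1),
    (5, 50, 165, 350, 170, 1),
    (7, 60, 175, 500, 170, 2),
]
sample = pd.DataFrame(rows, columns=top5 + ["caketaste"])

# A shallow tree keeps the extracted rules short enough to read.
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(sample[top5], sample["caketaste"])

# Each root-to-leaf path ending in "class: 2" is a candidate rule for a good cake.
rules = export_text(tree, feature_names=top5)
print(rules)
```

Following a path that ends in "class: 2" gives threshold conditions such as feature <= value, which can be read as the value ranges asked for in task b). On the full 200 rows, sample[top5] and sample["caketaste"] would be replaced by the corresponding columns of df.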
