Keras Assertion Error while Fitting Model

I am trying to replicate this model on my dataset: https://docs.seldon.io/projects/alibi/en/stable/examples/cem_iris.html

TF version: 2.8.2, Keras version: 2.8.0. I'm getting an "AssertionError" when I try to fit the model. My dataset has 59 columns and the target variable has 3 classes.

The code:

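# Imports are assumed here; the original post did not show them.
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical

df = pd.read_csv('file.csv')
df = df.dropna(subset=['column names'])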

X = df.drop(columns=['target'], axis = 1)
y = df['target']

num_pipeline = Pipeline([
        ('std_scaler', StandardScaler())             
    ])

X = num_pipeline.fit_transform(X)
idx = 1000
x_train,y_train = X[:idx,:], y[:idx]
x_test, y_test = X[idx+1:,:], y[idx+1:]
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

def lr_model():
    x_in = Input(shape=(59,))
    x_out = Dense(3, activation='softmax')(x_in)
    lr = Model(inputs=x_in, outputs=x_out)
    lr.compile(loss='categorical_crossentropy',
               optimizer='rmsprop', metrics=['accuracy'])
    return lr

lr = lr_model()
lr.summary()
lr.fit(x_train, y_train, batch_size=180, epochs=500, verbose=0)

The output and the full error traceback:

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_37 (InputLayer)       [(None, 59)]              0         
                                                                 
 dense_36 (Dense)            (None, 3)                 180       
                                                                 
=================================================================
Total params: 180
Trainable params: 180
Non-trainable params: 0
_________________________________________________________________
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-123-0216109227fb> in <module>
     15 lr = lr_model()
     16 lr.summary()
---> 17 lr.fit(x_train, y_train, batch_size=181, epochs=500, verbose=0)

7 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/training_v1.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
    794         max_queue_size=max_queue_size,
    795         workers=workers,
--> 796         use_multiprocessing=use_multiprocessing)
    797 
    798   def evaluate(self,

/usr/local/lib/python3.7/dist-packages/keras/engine/training_generator_v1.py in fit(self, model, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, **kwargs)
    775         shuffle=shuffle,
    776         initial_epoch=initial_epoch,
--> 777         steps_name='steps_per_epoch')
    778 
    779   def evaluate(self,

/usr/local/lib/python3.7/dist-packages/keras/engine/training_generator_v1.py in model_iteration(model, data, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch, mode, batch_size, steps_name, **kwargs)
    250 
    251       is_deferred = not model._is_compiled
--> 252       batch_outs = batch_function(*batch_data)
    253       if not isinstance(batch_outs, list):
    254         batch_outs = [batch_outs]

/usr/local/lib/python3.7/dist-packages/keras/engine/training_v1.py in train_on_batch(self, x, y, sample_weight, class_weight, reset_metrics)
   1061           y,
   1062           sample_weights=sample_weights,
-> 1063           output_loss_metrics=self._output_loss_metrics)
   1064       outputs = (output_dict['total_loss'] + output_dict['output_losses']
   1065                  + output_dict['metrics'])

/usr/local/lib/python3.7/dist-packages/keras/engine/training_eager_v1.py in train_on_batch(model, inputs, targets, sample_weights, output_loss_metrics)
    310           sample_weights=sample_weights,
    311           training=True,
--> 312           output_loss_metrics=output_loss_metrics))
    313   if not isinstance(outs, list):
    314     outs = [outs]

/usr/local/lib/python3.7/dist-packages/keras/engine/training_eager_v1.py in _process_single_batch(model, inputs, targets, output_loss_metrics, sample_weights, training)
    245       ValueError: If the model has no loss to optimize.
    246   """
--> 247   with backend.eager_learning_phase_scope(1 if training else 0), \
    248       training_utils.RespectCompiledTrainableState(model):
    249     with GradientTape() as tape:

/usr/lib/python3.7/contextlib.py in __enter__(self)
    110         del self.args, self.kwds, self.func
    111         try:
--> 112             return next(self.gen)
    113         except StopIteration:
    114             raise RuntimeError("generator didn't yield") from None

/usr/local/lib/python3.7/dist-packages/keras/backend.py in eager_learning_phase_scope(value)
    590   global _GRAPH_LEARNING_PHASES  # pylint: disable=global-variable-not-assigned
    591   assert value in {0, 1}
--> 592   assert tf.compat.v1.executing_eagerly_outside_functions()
    593   global_learning_phase_was_set = global_learning_phase_is_set()
    594   if global_learning_phase_was_set:

Given that you load the data as a DataFrame, and assuming the dataset really does contain 59 features as you mention above, I think you only need to change the code where you slice the dataset into training and test sets:

x_train,y_train = X[:idx,:], y[:idx]
x_test, y_test = X[idx+1:,:], y[idx+1:]

should be changed to:

x_train,y_train = X[:idx], y[:idx]
x_test, y_test = X[idx+1:], y[idx+1:]

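Or, if X is still a pandas DataFrame at that point (fit_transform returns a NumPy array by default, which has no .iloc):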

x_train,y_train = X.iloc[:idx,:], y[:idx]
x_test, y_test = X.iloc[idx+1:,:], y[idx+1:]
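
The slicing forms above can be checked on a small stand-in array. This is only a sketch: it assumes X is a NumPy array (which is what the pipeline's fit_transform returns by default), and the 1200-row matrix is made up for illustration. On an array, X[:idx, :] and X[:idx] select the same rows. It also shows that starting the test split at idx + 1 leaves row idx out of both sets.

```python
import numpy as np

# Hypothetical stand-in for the scaled feature matrix: 1200 rows, 59 columns.
X = np.arange(1200 * 59, dtype=float).reshape(1200, 59)
idx = 1000

# On a NumPy array both slicing forms select exactly the same rows.
same = np.array_equal(X[:idx, :], X[:idx])

# Starting the test split at idx + 1 skips row idx entirely;
# starting at idx keeps every row in one of the two sets.
x_train = X[:idx]
x_test_skip = X[idx + 1:]
x_test_full = X[idx:]

print(same)
print(x_train.shape, x_test_skip.shape, x_test_full.shape)
```

If the row at position idx should not be dropped, X[idx:] and y[idx:] are the slices to use for the test set.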

Note, however, that you are scaling the dataset in a way that causes data leakage, because the pipeline is fitted on the whole dataset. Instead, you should call fit_transform on the training set and then only transform on the test set.
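
A minimal sketch of that leakage-free order, assuming a plain StandardScaler in place of the one-step pipeline. The random matrix is a made-up stand-in for the 59-feature dataset, and the split uses X[idx:] rather than X[idx+1:] so that no row is dropped; both are choices of this sketch, not part of the original code.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 1200 samples, 59 features.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(1200, 59))
idx = 1000

# Split first, so the scaler never sees the test rows.
x_train_raw, x_test_raw = X[:idx], X[idx:]

scaler = StandardScaler()
# Learn the mean and standard deviation from the training set only.
x_train = scaler.fit_transform(x_train_raw)
# Reuse the training statistics on the test set.
x_test = scaler.transform(x_test_raw)

print(x_train.shape, x_test.shape)
```

The same order applies to a Pipeline object: call num_pipeline.fit_transform on the training rows and num_pipeline.transform on the test rows.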
