Can't flatten output from Keras model

I have the following model built with Keras, which I am training using StratifiedKFold. Training works well and the performance is good. Now I am trying to explain the model's predictions using the SHAP library. My dataset's shape is (107012, 67), and below is the code I wrote to encode my data, train the model, and make predictions. original_X is the variable holding the data I read with Pandas. Most of my data is categorical; only one column contains continuous values.

from sklearn.compose import make_column_transformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

ohe = OneHotEncoder()
mms = MinMaxScaler()

ct = make_column_transformer(
    (ohe, categorical_columns_encode),
    (mms, numerical_columns_encode),
    remainder='passthrough')

ct.fit(original_X.astype(str))
X = ct.transform(original_X.astype(str))
print(X.shape) # Shape of the encoded value (107012, 47726)

from collections import Counter
from sklearn.metrics import classification_report
from sklearn.model_selection import RepeatedStratifiedKFold
from tensorflow.keras import Input, Sequential
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.metrics import BinaryAccuracy, Precision, Recall
from tensorflow.keras.optimizers import Adam
# assuming the tf.keras scikit-learn wrapper for KerasClassifier
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

recall = Recall(name="recall")
prec = Precision(name="precision")
ba = BinaryAccuracy()

def get_model():
  network = Sequential()
  network.add(Input(shape=X_1.shape))
  network.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
  network.add(Dropout(0.5))
  network.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
  network.add(Dropout(0.5))
  network.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
  # network.add(Flatten())
  network.add(Dense(1, activation='sigmoid'))

  network.compile(loss='binary_crossentropy',
              optimizer=Adam(learning_rate=0.001),
              metrics=[recall, prec, ba])
  return network

classifier = KerasClassifier(build_fn=get_model)
kfold = RepeatedStratifiedKFold(n_splits=3, n_repeats=3, random_state=42)

callback = EarlyStopping(
    monitor='val_recall',
    min_delta=0,
    patience=0,
    verbose=1,
    mode="auto",
    baseline=None,
    restore_best_weights=True
)

epochs_per_fold = []

for train, validation in kfold.split(X_1, y_1):
  X_train, X_validation = X_1[train], X_1[validation]
  y_train, y_validation = y_1[train], y_1[validation]

  # Printing the distribution of classes in the training set
  counter = Counter(y_train)
  print("Number of class distributions of the training set ", counter)
  print("Minority case percentage of the training set ", counter[1] / (counter[0] + counter[1]))
  
  # Training our model and saving the history of the training
  history = classifier.fit(
    x=X_train,
    y=y_train,
    verbose=1,
    epochs=30,
    shuffle=True,
    callbacks=[callback],
    class_weight={0: 1.0, 1: 3.0},
    validation_data=(X_validation, y_validation))

  # predict classes for our validation set in order to manually verify the metrics
  yhat_classes = (classifier.predict(X_validation) > 0.5).astype("int32")

  TP = 0
  FP = 0
  TN = 0
  FN = 0

  # Record our predictions in the confusion matrix to manually verify our metrics
  for p,t in zip(y_validation, yhat_classes):
    if p == 1 and t == 1:
      TP += 1
    elif p == 0 and t == 1:
      FP += 1
    elif p == 1 and t == 0:
      FN += 1
    elif p == 0 and t == 0:
      TN += 1
  
  print("\n")
  print(" "*16, "T  F")
  print("Positive result ", TP, FP, )
  print("Negative result ", TN, FN, )
  print("\n")

  # Printing the built in classification report of our model
  print(classification_report(y_validation, yhat_classes))

  report_dict = classification_report(y_validation, yhat_classes, output_dict=True)

  # Record the average number of epochs of training
  epochs_per_fold.append(len(history.history['recall']))
  print(yhat_classes)

Here I am trying to use the DeepExplainer from the SHAP library to inspect my predictions.

import numpy as np
import shap

# we use the first 100 training examples as our background dataset to integrate over
background = X_2[np.random.choice(X_2.shape[0], 100, replace=False)]

explainer = shap.DeepExplainer(get_model(), background)

When execution reaches the explainer declaration, the following error is raised.

Your TensorFlow version is newer than 2.4.0 and so graph support has been removed in eager mode. See PR #1483 for discussion.
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-113-d24b2d1e3b91> in <module>()
----> 1 explainer = shap.DeepExplainer(get_model(), background)

1 frames
/usr/local/lib/python3.7/dist-packages/shap/explainers/_deep/deep_tf.py in __init__(self, model, data, session, learning_phase_flags)
    100         self.model_output = _get_model_output(model)
    101         assert type(self.model_output) != list, "The model output to be explained must be a single tensor!"
--> 102         assert len(self.model_output.shape) < 3, "The model output must be a vector or a single value!"
    103         self.multi_output = True
    104         if len(self.model_output.shape) == 1:

AssertionError: The model output must be a vector or a single value!

My questions are:

  1. How can I flatten the output of the model returned by get_model?
  2. Is there a better way to explain my predictions with SHAP?

Please let me know if I need to share any additional information.

Adding a Flatten layer after the Dense layers is what causes the error. Note that the line which raises the error is,

assert len(self.model_output.shape) < 3, "The model output must be a vector or a single value!"        

Given a 2D input, the output of a Dense layer is (None, units). So if we have a Dense(32) layer and the batch size is set to 16, the output of that layer will be a tensor of shape (16, 32). The Flatten layer preserves axis 0 (i.e. the batch dimension), so a tensor of shape (16, 32) cannot be flattened any further.

On the other hand, if you have a tensor of shape (16, 32, 3) (for example, the output of a Conv1D layer with 3 filters), then the output of the Flatten layer will be a tensor of shape (16, 96).
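
As a quick check of these shapes, here is a minimal sketch; the feature and sequence sizes are illustrative, not the ones from the question:

import numpy as np
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Conv1D, Dense, Flatten

# Dense on a 2D input: the output is (batch, units), already a flat vector per sample
dense_model = Sequential([Input(shape=(67,)), Dense(32)])
print(dense_model.output_shape)               # (None, 32)
print(dense_model(np.zeros((16, 67))).shape)  # (16, 32)

# A 3D tensor, e.g. from a Conv1D layer with 3 filters, does get collapsed by Flatten
conv_model = Sequential([
    Input(shape=(34, 1)),
    Conv1D(3, kernel_size=3),  # -> (None, 32, 3)
    Flatten()                  # -> (None, 96), the batch axis is preserved
])
print(conv_model.output_shape)                # (None, 96)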

Since you have 2D inputs, simply remove the Flatten layer. If you are trying to reshape the output, use a Reshape layer instead.
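
As a sketch of what that looks like for this model: with a 2D input and no Flatten layer, the final Dense(1, activation='sigmoid') already produces an output of shape (None, 1), which satisfies the assertion quoted above. This assumes the Input shape is the per-sample feature vector (47726 encoded features, per the question); the hidden stack is abbreviated here:

from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense, Dropout

num_features = 47726  # encoded feature count from the question
model = Sequential([
    Input(shape=(num_features,)),   # per-sample feature shape, no sample dimension
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])
print(model.output.shape)           # (None, 1)
print(len(model.output.shape) < 3)  # True -> passes DeepExplainer's check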
