
Keras Custom Loss for One-Hot Encoded

I currently have a trained DNN that predicts a one-hot encoded classification for the state a game is in. Essentially, imagine there are three states: 0, 1, or 2.

Now, I would normally use categorical_cross_entropy for the loss function, but I realized that not all classifications are equal for my states. For example:

  • If the model predicts it should be state 1, there is no cost to my system even if that classification is wrong, since state 1 is basically "do nothing", so the reward is 0x.
  • If the model correctly predicts state 0 or 2 (i.e. predict = 2 and correct = 2), then that reward should be 3x.
  • If the model incorrectly predicts state 0 or 2 (i.e. predict = 2 and correct = 0), then that reward should be -1x (see the payoff-matrix sketch after this list).
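
One way to make these rules concrete is as a payoff matrix indexed by (true state, predicted state). Below is a minimal sketch of one reading of the rules; the name reward_matrix is just for illustration, and the row for true state 1 assumes a wrong prediction of 0 or 2 still costs -1x:

import numpy as np

# Rows: true state; columns: predicted state.
# Predicting 1 always pays 0; a correct 0/2 pays 3; a wrong 0/2 pays -1.
reward_matrix = np.array([
    [ 3.0, 0.0, -1.0],  # true state 0
    [-1.0, 0.0, -1.0],  # true state 1
    [-1.0, 0.0,  3.0],  # true state 2
])

print(reward_matrix[2, 2])  # correct prediction of state 2 -> 3.0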

I know that we can declare custom loss functions in Keras, but I keep getting stuck formulating it. Does anyone have suggestions for how to translate that pseudo code? I can't see how I'd do it as a vectorized operation.

Additional question: I think what I'm essentially after is a reward function. Is this the same as a loss function? Thanks!

def custom_expectancy(y_expected, y_pred):

    # Collapse each one-hot row to its class index: 0, 1 or 2
    expected_norm = tf.argmax(y_expected, axis=-1)
    predicted_norm = tf.argmax(y_pred, axis=-1)
    
    # Some pseudo code....
    # Now, if predicted == 1
    #     loss += 0
    # elif predicted == expected
    #     loss -= 3
    # elif predicted != expected
    #     loss += 1
    #
    # return loss

Sources consulted:

https://datascience.stackexchange.com/questions/55215/how-do-i-create-a-keras-custom-loss-function-for-a-one-hot-encoded-binary-classi

Custom loss in Keras with softmax to one-hot

Code Update

import tensorflow as tf

def custom_expectancy(y_expected, y_pred):

    # Collapse each one-hot row to its class index: 0, 1 or 2
    expected_norm = tf.argmax(y_expected, axis=-1)
    predicted_norm = tf.argmax(y_pred, axis=-1)

    results = tf.unstack(expected_norm)
    
    # Some pseudo code....
    # Now, if predicted == 1
    #     loss += 0
    # elif predicted == expected
    #     loss += 3
    # elif predicted != expected
    #     loss -= 1
    
    # Note: this per-element loop only works eagerly; it fails on symbolic
    # (graph-mode) tensors, which is the problem described below.
    for idx in range(len(results)):
        predicted = predicted_norm[idx]
        expected = expected_norm[idx]

        if predicted == 1: # do nothing
            results[idx] = 0.0
        elif predicted == expected: # reward
            results[idx] = 3.0
        else: # wrong, so we lost
            results[idx] = -1.0

    return tf.stack(results)

I think this is what I'm after, but I haven't quite figured out how to build the correct tensor (which should be the size of the batch) to return.

The best way to build a conditional custom loss is to use tf.keras.backend.switch, without involving loops.

In your case, you should combine two switch conditional expressions to obtain the desired result.
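For reference, when its condition is a boolean tensor of the same shape as its branches, tf.keras.backend.switch selects elementwise (like tf.where). A minimal sketch of that behavior, with illustrative values:

import tensorflow as tf

cond = tf.constant([True, False, True])
ones = tf.ones(3)
zeros = tf.zeros(3)
print(tf.keras.backend.switch(cond, ones, zeros))  # -> [1., 0., 1.]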

The desired loss function can be reproduced this way:

def custom_expectancy(y_expected, y_pred):
    
    zeros = tf.cast(tf.reduce_sum(y_pred*0, axis=-1), tf.float32) ### important to produce gradient
    y_expected = tf.cast(tf.reshape(y_expected, (-1,)), tf.float32)
    class_pred = tf.argmax(y_pred, axis=-1)
    class_pred = tf.cast(class_pred, tf.float32)
    
    cond1 = (class_pred != y_expected) & (class_pred != 1)
    cond2 = (class_pred == y_expected) & (class_pred != 1)
    
    res1 = tf.keras.backend.switch(cond1, zeros - 1, zeros)
    res2 = tf.keras.backend.switch(cond2, zeros + 3, zeros)
    
    return res1 + res2

Here cond1 is when the model incorrectly predicts state 0 or 2, and cond2 is when the model correctly predicts state 0 or 2. The default value, returned when neither cond1 nor cond2 is activated, is zero. Note that, as written, the function returns a reward (higher is better); since Keras minimizes the loss during training, you may want to negate it, as the sparse_loss implementation further below does.

Note that y_expected can be passed as a simple tensor/array of integer-encoded states (no need to one-hot them).

Here is how the loss function works:

true = tf.constant([[1],    [2],    [1],    [0]    ])  ## no need to one-hot
pred = tf.constant([[0,1,0],[0,0,1],[0,0,1],[0,1,0]])

custom_expectancy(true, pred)

Which returns:

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 0.,  3., -1.,  0.], dtype=float32)>

That seems to be consistent with our needs.

To use the loss inside a model:

import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

X = np.random.uniform(0, 1, (1000, 10))
y = np.random.randint(0, 3, (1000,))  ## no need to one-hot

model = Sequential([Dense(3, activation='softmax')])
model.compile(optimizer='adam', loss=custom_expectancy)
model.fit(X, y, epochs=3)

Here is the running notebook.

Here is a nice post explaining the concepts of the loss function and the cost function. Multiple answers there illustrate how different authors in the field of machine learning treat the two terms.

As for the loss function, you may find the following implementation useful. It implements a weighted cross-entropy loss, where each class is weighted according to its representation in the training set. This could be adapted to satisfy the constraints specified above; a sketch follows below.
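For example, here is a minimal sketch of a class-weighted categorical cross-entropy, assuming one-hot y_true; the weight values are placeholders you would set from your class frequencies or from the reward structure above:

import tensorflow as tf

def weighted_categorical_crossentropy(weights):
    # weights: one scalar per class, e.g. [1.0, 0.5, 1.0] (illustrative values)
    weights = tf.constant(weights, dtype=tf.float32)
    def loss(y_true, y_pred):
        # Clip to avoid log(0); y_true is expected to be one-hot here.
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)
        return -tf.reduce_sum(weights * y_true * tf.math.log(y_pred), axis=-1)
    return loss

# model.compile(optimizer='adam',
#               loss=weighted_categorical_crossentropy([1.0, 0.5, 1.0]))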

Here's how you want to do it. If your ground truth y_true is dense (shaped N3), you can use tf.reduce_all(y_true == [0.0, 0.0, 1.0], axis=-1, keepdims=True) and tf.reduce_all(y_true == [1.0, 0.0, 0.0], axis=-1, keepdims=True) to control the if/elif/else. You could further optimize this with tf.gather.

def sparse_loss(y_true, y_pred):
  """Calculate loss for game. Follows keras loss signature.
  
  Args:
    y_true: Sparse tensor of shape N1, where correct prediction
      is encoded as 0, 1, or 2. 
    y_pred: Tensor of shape N3. For each row, the three columns
      represent the predicted probability of each state. 
      For example, [0.1, 0.3, 0.6] means, "There's a 10% chance the
      right state is 0, a 30% chance the right state is 1,
      and 60% chance the right state is 2". 
  """

  # This is the unvectorized implementation on individual rows which is more
  # intuitive. But TF requires vectorization. 
  # if y_true == 0:
  #   # Value matrix is shape 3. Broadcasting will occur. 
  #   return -tf.reduce_sum(y_pred * [3.0, 0.0, -1.0])
  # elif y_true == 2:
  #   return -tf.reduce_sum(y_pred * [-1.0, 0.0, 3.0])
  # else:
  #   # According to the rules, this is never the correct
  #   # state to predict, so it should never show up.
  #   assert False, f'Impossible state reached. y_true: {y_true}, y_pred: {y_pred}.'


  # We vectorize by calculating the reward for all predictions for two cases:
  # if y_true is zero or if y_true is two. To eliminate this inefficiency, we
  # could use tf.gather to build an N3 shaped matrix to multiply against.
  reward_for_true_zero = tf.reduce_sum(y_pred * [3.0, 0.0, -1.0], axis=-1, keepdims=True) # N1
  reward_for_true_two = tf.reduce_sum(y_pred * [-1.0, 0.0, 3.0], axis=-1, keepdims=True) # N1

  reward = tf.where(y_true == 0.0, reward_for_true_zero, reward_for_true_two) # N1
  return -tf.reduce_sum(reward)
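
As the comment above suggests, the two-branch computation can be replaced with a table lookup. Here is a minimal sketch of that tf.gather variant, assuming the same sparse N1-shaped y_true; reward_table and gathered_loss are illustrative names:

import tensorflow as tf

# Row i holds the per-class reward vector used when the true state is i.
reward_table = tf.constant([[ 3.0, 0.0, -1.0],   # true state 0
                            [-1.0, 0.0, -1.0],   # true state 1 (should not occur per the rules)
                            [-1.0, 0.0,  3.0]])  # true state 2

def gathered_loss(y_true, y_pred):
    idx = tf.cast(tf.reshape(y_true, (-1,)), tf.int32)
    rows = tf.gather(reward_table, idx)   # shape N3
    return -tf.reduce_sum(rows * y_pred)  # negate so minimizing the loss maximizes the reward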
