How can I compute class weights for an output that has 4 neurons with Keras?
I've seen how to correct class imbalance with class weights for a single classification output. But in my case, my output layer is:
model.add(Dense(4, activation='sigmoid'))
My target is a DataFrame that has:
0 1 2 3
0 1 1 0 0
1 0 0 0 0
2 1 1 1 0
3 1 1 0 0
4 1 1 0 0
5 1 1 0 0
6 1 0 0 0
... .. .. .. ..
14989 1 1 1 1
14990 1 1 1 0
14991 1 1 1 1
14992 1 1 1 0
[14993 rows x 4 columns]
My predictions can take one of 5 possible values:
[[0, 0, 0, 0],
[1, 0, 0, 0],
[1, 1, 0, 0],
[1, 1, 1, 0],
[1, 1, 1, 1]]
However, those classes are certainly not balanced. I've seen how to compute the class weights if I have 1 target output with a softmax, but this is slightly different.
Specifically,
model.fit(..., class_weight=weights)
How can I define weights in this case?
IMO you should use almost standard categorical_crossentropy and output logits from the network, which will be mapped in the loss function to the values [0,1,2,3,4] using the argmax operation (the same procedure will be applied to the one-hot-encoded labels; see the last part of this answer for an example).
Using weighted crossentropy you can treat incorrectness differently based on the predicted vs correct values, as you indicated in the comments.
All you have to do is take the absolute value of the difference between the correct and predicted values and multiply the loss by it; see the example below:
Let's map each encoding to its unary value (this can be done using argmax, as seen later):
[0, 0, 0, 0] -> 0
[1, 0, 0, 0] -> 1
[1, 1, 0, 0] -> 2
[1, 1, 1, 0] -> 3
[1, 1, 1, 1] -> 4
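The mapping above can be sketched in NumPy as follows. This is only an illustration, assuming every target row is a valid thermometer code (ones followed by zeros); the helper name thermometer_to_one_hot is made up for this example. Summing each row gives the class index, which is then one-hot encoded over 5 classes.

```python
import numpy as np

def thermometer_to_one_hot(target, num_classes=5):
    """Map rows like [1, 1, 0, 0] to a class index (2) and a one-hot row."""
    target = np.asarray(target)
    # For a valid thermometer code, the number of ones is the class index.
    class_index = target.sum(axis=1)
    # Row i of the identity matrix is the one-hot vector for class i.
    one_hot = np.eye(num_classes, dtype=int)[class_index]
    return class_index, one_hot

target = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1]]
class_index, one_hot = thermometer_to_one_hot(target)
print(class_index)  # [0 1 2 3 4]
```

The one-hot rows are the label format that a softmax-based loss over 5 classes would expect.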
And let's make some random targets and model predictions to see the essence:
correct predicted with Softmax
0 0 4
1 4 3
2 3 3
3 1 4
4 3 1
5 1 0
Now, when you subtract correct and predicted and take the absolute value, you essentially get a weighting column like this:
weights
0 4
1 1
2 0
3 3
4 2
5 1
As you can see, a prediction of 0 while the true target is 4 will be weighted 4 times more than a prediction of 3 with the same target of 4, and that is essentially what you want, IIUC.
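As a quick check, the weighting column can be reproduced in NumPy from the class indices in the example table (a sketch of the arithmetic only, not of the full loss):

```python
import numpy as np

# Class indices from the example table above.
correct = np.array([0, 4, 3, 1, 3, 1])
predicted = np.array([4, 3, 3, 4, 1, 0])

# Weight each sample by how far the predicted class is from the true one.
weights = np.abs(correct - predicted)
print(weights)  # [4 1 0 3 2 1]
```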
As Daniel Möller indicates in his answer, I would advise you to create a custom loss function as well, but a little simpler:
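import tensorflow as tf

# Output logits from your network, not the values after softmax activation
# Note: tf.losses.softmax_cross_entropy is the TensorFlow 1.x API
def weighted_crossentropy(labels, logits):
    return tf.losses.softmax_cross_entropy(
        labels,
        logits,
        # Per-sample weight: distance between predicted and true class index
        weights=tf.abs(tf.argmax(logits, axis=1) - tf.argmax(labels, axis=1)),
    )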
You should use this loss in your model.compile as well; I think there is no need to reiterate points already made.
For correct predictions the gradient will be equal to zero, which means it will be harder for the network to strengthen connections (maximize/minimize the logits toward +inf/-inf).
Your model cannot predict non-existent classification values; e.g. in your multi-target case it could predict [1, 0, 1, 0], and there is no such possibility with the approach above. Fewer degrees of freedom would help it train and remove the chance of nonsensical (if I got your problem description right) predictions.
Additional discussion is provided in the chat room in the comments.
Here is an example network with the custom loss function defined above. Your labels have to be one-hot-encoded in order for it to work correctly.
import keras
import numpy as np
import tensorflow as tf

# You could actually make it a lambda function as well
def weighted_crossentropy(labels, logits):
    return tf.losses.softmax_cross_entropy(
        labels,
        logits,
        weights=tf.abs(tf.argmax(logits, axis=1) - tf.argmax(labels, axis=1)),
    )

model = keras.models.Sequential(
    [
        keras.layers.Dense(32, input_shape=(10,)),
        keras.layers.Activation("relu"),
        keras.layers.Dense(10),
        keras.layers.Activation("relu"),
        keras.layers.Dense(5),  # 5 logits, one per class; no softmax here
    ]
)

data = np.random.random((32, 10))
# Pass num_classes so the labels always have 5 columns,
# even if the random draw happens to miss the highest class
labels = keras.utils.to_categorical(np.random.randint(5, size=(32, 1)), num_classes=5)

model.compile(optimizer="rmsprop", loss=weighted_crossentropy)
model.fit(data, labels, batch_size=32)
(Removed) First, you should fix your one-hot encoding:
(Removed) pd.get_dummies(target)
Calculate each class weight by counting the occurrences of each value from np.unique(target) and dividing by target.shape[0], which gives the proportions:
import numpy as np

target = np.array([[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1]])
# Unique rows (classes) and how many times each one occurs
classes, counts = np.unique(target, axis=0, return_counts=True)
# Proportion of samples belonging to each class
proportion = counts / target.shape[0]
class_weight = dict(enumerate(proportion))
model.fit(..., class_weight=class_weight)
Considering you have your targets (ground truth y) with shape (samples, 4), you can simply:
positives = targetsAsNumpy.sum(axis=0)
totals = len(targetsAsNumpy)
negativeWeights = positives / totals
positiveWeights = 1 - negativeWeights
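To see what these weights look like, here is the same computation on a small, made-up target array (the data and the snake_case variable names are assumptions for illustration only). The rarer the positives in a column, the larger that column's positive weight.

```python
import numpy as np

# Hypothetical toy targets with shape (samples, 4).
targets_as_numpy = np.array([
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
])

positives = targets_as_numpy.sum(axis=0)   # ones per column: [3 2 1 0]
totals = len(targets_as_numpy)             # 4 samples
negative_weights = positives / totals      # [0.75 0.5 0.25 0.]
positive_weights = 1 - negative_weights    # [0.25 0.5 0.75 1.]
```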
The class weights in the fit method are meant for categorical problems (only one correct class).
I suggest you create a custom loss with these, supposing you are using binary_crossentropy.
import keras.backend as K

posWeightsK = K.constant(positiveWeights.reshape((1, 4)))
negWeightsK = K.constant(negativeWeights.reshape((1, 4)))

def weightedLoss(yTrue, yPred):
    loss = K.binary_crossentropy(yTrue, yPred)
    # Positive targets use the positive weights; the rest use the negative weights
    loss = K.switch(K.greater(yTrue, 0.5), loss * posWeightsK, loss * negWeightsK)
    return K.mean(loss)  # optionally K.mean(loss, axis=-1) for further customization
Use this loss in the model:
model.compile(loss = weightedLoss, ...)
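The arithmetic of this loss can be checked outside Keras with a NumPy sketch. This is not the Keras implementation; the function name weighted_bce, the clipping constant eps, and the sample values are assumptions made for this example.

```python
import numpy as np

def weighted_bce(y_true, y_pred, pos_weights, neg_weights, eps=1e-7):
    """NumPy sketch of per-output weighted binary crossentropy."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    loss = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    # Positive targets use the positive weight, all others the negative weight.
    weights = np.where(y_true > 0.5, pos_weights, neg_weights)
    return np.mean(loss * weights)

y_true = np.array([[1.0, 1.0, 0.0, 0.0]])
y_pred = np.array([[0.9, 0.6, 0.2, 0.1]])
pos_weights = np.array([[0.25, 0.5, 0.75, 1.0]])
neg_weights = 1 - pos_weights
print(weighted_bce(y_true, y_pred, pos_weights, neg_weights))
```

Each output column is penalized with its own weight, chosen per element according to whether the target is 1 or 0.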
For this value encoding (unary, also called 'thermometer code') you can simply measure the error on each value separately and add them, using e.g. binary_crossentropy or even a mean squared / mean absolute error metric. Given this output, it's not really a classification problem; it's a discrete representation of a regression task. But such representations are effective in certain cases, e.g. as the paper Thermometer Encoding: One Hot Way To Resist Adversarial Examples describes.
While such separate error measurements don't ensure that 'invalid' outputs (e.g. [1 0 0 0 1]) are impossible, they'll be very unlikely for any well-fit network, and the approach does have the property that, if the correct value is [1 1 1 1 0], then a prediction of [1 1 0 0 0] is "twice as wrong" as a prediction of [1 1 1 0 0]. And you don't need to adjust the 'class weights' to achieve these results.
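The "twice as wrong" property is easy to verify with a summed absolute error, shown here as a small NumPy sketch (the helper name total_abs_error is made up for this example):

```python
import numpy as np

correct = np.array([1, 1, 1, 1, 0])
near_miss = np.array([1, 1, 1, 0, 0])
far_miss = np.array([1, 1, 0, 0, 0])

def total_abs_error(y_true, y_pred):
    """Sum of per-position absolute errors."""
    return int(np.abs(y_true - y_pred).sum())

print(total_abs_error(correct, near_miss))  # 1
print(total_abs_error(correct, far_miss))   # 2
```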