Why is logit_dimension=1 for the binary classification head in TensorFlow?

Question

I am having a hard time understanding how the binary classification head works in TensorFlow. I am trying to create a custom multi-head Estimator in TensorFlow. My code looks like the following:

def model_fn_multihead(features, labels, mode, params):
    # Create simple heads and specify head name.
    head_target_0 = tf.contrib.estimator.binary_classification_head(name=target_0)

    head_target_1 = tf.contrib.estimator.multi_class_head(n_classes=3, name=target_1)

    # Create multi-head from two simple heads.
    head = tf.contrib.estimator.multi_head([head_target_0, head_target_1])
    # Create logits for each head, and combine them into a dict.

    net = tf.feature_column.input_layer(features, params['feature_columns'])
    for idx, units in enumerate(params['hidden_units']):
        net = tf.layers.dense(net, units=units, activation=tf.nn.relu, name='fully_connected_%d' % idx)

    # Compute logits (1 per class).
    logits_0 = tf.layers.dense(net, 2, activation=None, name='logits_0')
    logits_1 = tf.layers.dense(net, 3, activation=None, name='logits_1')

    logits = {target_0: logits_0, target_1: logits_1}

    def _train_op_fn(loss):
        return tf.train.AdagradOptimizer(learning_rate=0.01).minimize(loss, global_step=tf.train.get_global_step())    

    return head.create_estimator_spec(features=features, labels=labels, mode=mode, logits=logits, train_op_fn=_train_op_fn)

The problem is that if I run the code as is, TensorFlow complains that logits_0 has the wrong dimensions. Digging into the source code at tensorflow\contrib\estimator\python\estimator\multi_head.py, I see that it expects a logits dimension of 1 for logits_0, but clearly a binary classifier has two classes. What's going on? If I set the dimension to 1, the code runs, but I always get nonsensical results in training. Basically, the classifier can't learn the difference between a 1/0 target even with a single trivial feature.

This code works perfectly for multiple multi-class heads (n_classes > 2).

I am using TensorFlow 1.4. Am I simply misunderstanding something? Perhaps my input is formatted incorrectly?

Update:

I figured out what the problem is: TensorFlow is expecting a tensor of type bool. It is not enough to submit labels of 1, 0, 0, 1, etc.; wrapping the label with tf.equal(label, 1) solved the issue. Now I understand why the logits_dimension is 1. However, this still does not solve my actual problem, which is that the binary classifier just doesn't seem to work when wrapped in a multi_head. The classification results are always wrong.

Consider a simple, trivial example involving a single categorical variable called CAT_XXX, where XXX is a number between 1 and 100. If we construct two target variables:

Target_2: 0 if XXX%2==0, 1 if XXX%3==0, else 2
Target_3: 0 if XXX%2==0, else 1

we can construct a trivial multi-headed, multi-class problem. In such a scenario, I obtain results like:

accuracy/Target_2: 1.0
accuracy/Target_3: 0.600072
accuracy_baseline/Target_3: 0.600072
auc/Target_3: 0.497585
auc_precision_recall/Target_3: 0.399472
average_loss/Target_2: 0.000260735
average_loss/Target_3: 0.673509
global_step: 11720
label/mean/Target_3: 0.399928
loss: 21.5472
prediction/mean/Target_3: 0.399628

You can see that the multi-class target has been predicted perfectly, but the binary results are nonsense. The thing is, the binary_classification head works fine as a standalone input to DNNEstimator. It is only when it is wrapped in a multi_head that things seem to go wrong.
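A quick check of the numbers above, assuming accuracy_baseline is the accuracy obtained by always predicting the majority class: with a label mean of 0.399928, that baseline is 1 - 0.399928 = 0.600072, which matches the reported accuracy. The helper name below is illustrative and not part of TensorFlow.

```python
def majority_class_accuracy(label_mean):
    """Accuracy of always predicting the more frequent of two classes."""
    return max(label_mean, 1.0 - label_mean)

label_mean = 0.399928          # label/mean/Target_3 from the metrics above
reported_accuracy = 0.600072   # accuracy/Target_3 from the metrics above

baseline = majority_class_accuracy(label_mean)
print(round(baseline, 6))

# The reported accuracy is no better than the constant-prediction baseline.
assert abs(baseline - reported_accuracy) < 1e-6
```

Together with an AUC close to 0.5 and a prediction mean close to the label mean, this suggests the binary head may be producing roughly the same probability for every example.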

Kuhan

Answer 1

binary_classification_head() has a logit dimension of 1 because the probabilities of the two classes in a binary classification are [alpha, 1-alpha], so a single value determines both.
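One way to see why a single logit is enough: a two-class softmax over the logits [0, z] gives the same positive-class probability as a sigmoid applied to z alone. The sketch below checks this in plain Python; the helper names are illustrative and not part of TensorFlow.

```python
import math

def sigmoid(z):
    """Probability of the positive class from a single logit."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

for z in (-3.0, -0.5, 0.0, 1.2, 4.0):
    p_one_logit = sigmoid(z)              # one logit, as the binary head expects
    p_two_logits = softmax([0.0, z])[1]   # two logits, first one pinned to 0
    assert abs(p_one_logit - p_two_logits) < 1e-12
    # The two class probabilities are [alpha, 1 - alpha].
    print(z, round(p_one_logit, 6), round(1.0 - p_one_logit, 6))
```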

According to the TensorFlow binary_classification_head documentation, if 'label_vocabulary' is not given, labels must be a float Tensor with values in the interval [0, 1]. However, I tested the head with the labels wrapped as booleans (your suggestion) and with the labels cast to integers or floats, and they all give the same results. TensorFlow is probably doing an additional cast in the background.
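To illustrate why the label dtype may not matter once labels are cast, here is a plain-Python sketch of sigmoid cross-entropy with one logit per example. It is not TensorFlow's implementation; the function names and sample values are assumptions made for illustration. Bool, int, and float labels all yield the same loss after conversion to float.

```python
import math

def sigmoid_cross_entropy(label, logit):
    """Loss for one example; the label is converted to float first."""
    y = float(label)  # True -> 1.0, 1 -> 1.0, 0 -> 0.0
    # Numerically stable form: max(x, 0) - x*y + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * y + math.log1p(math.exp(-abs(logit)))

logits = [2.0, -1.0, 0.5, -3.0]              # hypothetical outputs, one logit per example
labels_int = [1, 0, 0, 1]
labels_bool = [v == 1 for v in labels_int]   # analogous to tf.equal(label, 1)
labels_float = [float(v) for v in labels_int]

def mean_loss(labels):
    """Average loss over the hypothetical batch."""
    return sum(sigmoid_cross_entropy(y, x) for y, x in zip(labels, logits)) / len(logits)

assert mean_loss(labels_int) == mean_loss(labels_bool) == mean_loss(labels_float)
print(mean_loss(labels_float))
```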

Since binary_classification_head() works in the isolated case but not in combination with multi_head(), I would recommend experimenting with your hyperparameters (especially the learning rate). Keep in mind that combining heads changes how the loss is calculated, so you need to adjust your model's hyperparameters. A good start might be to pass head_weights into multi_head() with your old set of hyperparameters and see how your model reacts (just a thought).
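As a rough sketch of the idea, assuming the multi-head loss is merged as a weighted sum of the per-head losses (which is how the head_weights argument of multi_head is described), the snippet below shows how weights shift the balance between heads. The function name and loss values are hypothetical.

```python
def merged_loss(head_losses, head_weights=None):
    """Weighted sum of per-head losses; equal weights when none are given."""
    if head_weights is None:
        head_weights = [1.0] * len(head_losses)
    if len(head_weights) != len(head_losses):
        raise ValueError("need exactly one weight per head")
    return sum(w * loss for w, loss in zip(head_weights, head_losses))

# Hypothetical per-head losses: binary head first, 3-class head second.
losses = [0.67, 0.01]

print(merged_loss(losses))               # both heads weighted equally
print(merged_loss(losses, [2.0, 1.0]))   # put more emphasis on the binary head
```

In the question's code, the corresponding change would be along the lines of tf.contrib.estimator.multi_head([head_target_0, head_target_1], head_weights=[2.0, 1.0]); the weights shown are placeholders to experiment with, not recommended values.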

Solution 1
0 2019-10-29 09:25:39
