Labels size does not match target_names: converting a TensorFlow multi-input regression to classification

Question

I am trying to convert a multi-input, mixed-data (txt, image) Keras model from a regression output (house price) to a classification output (number of bedrooms). In particular, I am altering this tutorial

https://www.pyimagesearch.com/2019/02/04/keras-multiple-inputs-and-mixed-data/

to be a classifier. I have a couple of technical questions about the number of categories, and I also get an error that I don't fully understand.

I have altered the last layer of the network to be a softmax:

x = Dense(11, activation="softmax")(x)

However, I only have 10 categories (the dataset covers houses with 1-10 bedrooms), and with Dense(10, ...) I get the following error:

InvalidArgumentError: Received a label value of 10 which is outside the valid range of [0, 10). Label values: 3 2 5 2 10 3 2 5

I understand the error, and how to avoid it, but why isn't the range [0, 10) sufficient given that I don't have houses with 0 bedrooms?

When I try to get a classification report, I get two warnings:

UserWarning: labels size, 6, does not match size of target_names, 10
UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.

I think these might be because the data behind my classification report only contains houses with 1-6 bedrooms, but I am not sure. Any insight you can give would be appreciated.

My code and the dataset can be cloned from here: https://github.com/davidrtfraser/blog-keras-multi-input

Answer 1

Generally in machine learning, labels for N classes are encoded as integers in the range 0 to N - 1, because these map directly to class indices, so you can use argmax to recover them from model outputs.
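To make the index argument concrete, here is a minimal NumPy sketch. The probability vector is made up for illustration and is not taken from the asker's model; `is_valid_label` is a hypothetical helper that only mirrors the range check behind the error message.

```python
import numpy as np

# Hypothetical softmax output for one house from a 10-unit output layer.
probs = np.array([0.01, 0.02, 0.60, 0.10, 0.05, 0.05, 0.05, 0.05, 0.04, 0.03])

num_classes = probs.shape[0]             # 10 output units -> valid indices 0..9
predicted_index = int(np.argmax(probs))  # index of the most probable class


def is_valid_label(label, n):
    """Hypothetical helper: a sparse integer label must be a valid output index."""
    return 0 <= label < n


print(predicted_index)                  # class index, not a bedroom count
print(is_valid_label(9, num_classes))   # highest usable label for 10 units
print(is_valid_label(10, num_classes))  # the label value from the error message
```

With 10 output units there is simply no output at index 10, so a raw label of 10 has nothing to point at, whether or not label 0 is ever used.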

So you need to encode your labels the same way. The easiest way is to shift your [1, 10] labels to [0, 9] by subtracting one from each label; to get the number of bedrooms from the model output, add one to the predicted label.
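A rough sketch of that round trip follows, with hypothetical arrays standing in for the real dataset and for the model's predictions. It also passes every class to scikit-learn's `classification_report` through the `labels` argument, which is one way to keep `labels` and `target_names` the same length when a test split contains only some of the classes. The `zero_division` argument is assumed to be available, as it is in recent scikit-learn releases.

```python
import numpy as np
from sklearn.metrics import classification_report

# Hypothetical bedroom counts (1-10) standing in for the real dataset column.
bedrooms_true = np.array([3, 2, 5, 2, 10, 3, 2, 5])

# Shift to the 0..9 range that a 10-unit softmax with sparse labels expects.
y_true = bedrooms_true - 1

# Hypothetical predicted class indices, e.g. the argmax over the softmax outputs.
y_pred = np.array([2, 1, 4, 1, 4, 2, 1, 4])

# Add one to turn class indices back into bedroom counts.
bedrooms_pred = y_pred + 1

# Listing all ten labels explicitly keeps labels and target_names aligned,
# even though several classes never occur in this small sample.
all_labels = list(range(10))
target_names = [f"{n}-bedroom" for n in range(1, 11)]

report = classification_report(
    y_true,
    y_pred,
    labels=all_labels,
    target_names=target_names,
    zero_division=0,  # report 0.0 for classes with no samples instead of warning
)
print(report)
```

Classes that are absent from the split still show up in the report with zero support, which is expected rather than a sign of a bug.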

Accepted, 2019-07-14 21:12:32
