What model (loss function, etc.) can be used in Keras for categorical training with probability labels instead of one-hot encoding?
I ran into a problem while designing my Keras model.
The training data (input) for the model consists of two sequential character-encoded lists and one non-sequential feature list. The output is a list of probabilities over 5 different classes. The testing data has the same features, but its output is a single class label instead of a probability distribution. The task is to build a model that learns from the training probabilities to predict the actual class on the testing data.
For example, the data looks like:
X_train, X_test = Sequential feature 1, Sequential feature 2, Non-sequential feature 3
y_train = probability for class 1, probability for class 2 ... , probability for class 5
y_test = 0/1, 0/1, ..., 0/1
X_train, X_test = [0, 0, 0, 11, 21, 1] + [ 0, 0, 0, 0, 0, 121, 1, 16] + [1, 0, 0.543, 0.764, 1, 0, 1]
y_train = [0.132561 , 0.46975598, 0.132561 , 0.132561 , 0.132561]
y_test = [0, 1, 0, 0, 0]
I have built two CNN models for the sequential data and a plain dense layer for the non-sequential data, then concatenated them into one mixed model with some dense layers and dropout. I used categorical_crossentropy as my loss function, while my targets are not strictly one-hot encoded. Will that be a problem? Is there any suggestion to improve the model?
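For what it's worth, categorical_crossentropy does not require one-hot targets: it computes -sum(y_true * log(y_pred)), which is well defined for any probability vector that sums to 1. A minimal NumPy sketch of the same formula Keras uses (the function name and example values here are illustrative):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy H(y_true, y_pred) = -sum(y_true * log(y_pred)).

    Nothing in this formula requires y_true to be one-hot: any
    non-negative vector summing to 1 (a "soft" label) is a valid target.
    """
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# A soft label from the question vs. a model's predicted distribution.
y_soft = np.array([[0.132561, 0.46975598, 0.132561, 0.132561, 0.132561]])
y_pred = np.array([[0.15, 0.45, 0.15, 0.15, 0.10]])

# The loss is minimized when the predicted distribution matches the
# soft target exactly, so the model is pushed toward the label
# distribution rather than toward a hard class.
print(categorical_crossentropy(y_soft, y_pred))
print(categorical_crossentropy(y_soft, y_soft))
```

So training directly on the probability vectors is mathematically sound; the gradient simply pulls the softmax output toward the target distribution.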
PS: taking the argmax of the training probabilities does not always recover the actual label. For example, given the probability list
[0.33719498 , 0.46975598, 0.06434968 , 0.06434968 , 0.06434968]
the actual label could be
[1, 0, 0, 0, 0]
Using probabilistic labels as ground truth does not seem to be a good idea. We assume the data are drawn from a fixed distribution; once drawn, they are fixed events.
From a theoretical point of view, this seems to violate the assumptions of the learning problem.
I would suggest converting the probabilistic labels to one-hot labels and seeing whether you observe an improvement.
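If you want to try that conversion, a short NumPy sketch (using the example rows from the question; note the question's PS already warns that the argmax can disagree with the actual label, so this step discards information):

```python
import numpy as np

y_train = np.array([
    [0.132561,   0.46975598, 0.132561,   0.132561,   0.132561],
    [0.33719498, 0.46975598, 0.06434968, 0.06434968, 0.06434968],
])

# Hard-label each row by its argmax, then one-hot encode via an
# identity matrix lookup.
hard = np.eye(y_train.shape[1])[np.argmax(y_train, axis=1)]
print(hard)
```

Training on `hard` instead of `y_train` lets you compare the two labeling schemes directly on your held-out test accuracy.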