
What is the best way to handle the background pixel classes (ignore_label) when training deep learning models for semantic segmentation?

Original post 2019-11-28 06:53:15 7 2 tensorflow/ machine-learning/ deep-learning/ pytorch/ semantic-segmentation

I am trying to train a UNET model on the Cityscapes dataset, which has 20 'useful' semantic classes and a number of background classes that can be ignored (e.g. sky, ego vehicle, mountains, street lights). To train the model to ignore these background pixels, I am using the following solution, which is popular on the internet:

Assign a common ignore_label (e.g. ignore_label=255) to all the pixels belonging to the ignored classes.
Train the model using the cross_entropy loss for each pixel prediction.
Provide the ignore_label parameter to the cross_entropy loss, so that the computed loss skips the pixels with the unnecessary classes.
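The three steps above can be sketched in plain NumPy to show what the ignore label does inside the loss. This is only an illustration, not the code from the question: the function name `masked_cross_entropy` and the flattened `(N, C)` layout are assumptions made here. In PyTorch the same behaviour is available through the `ignore_index` argument of `torch.nn.functional.cross_entropy`.

```python
import numpy as np

IGNORE_LABEL = 255  # value used in the question


def masked_cross_entropy(logits, labels, ignore_label=IGNORE_LABEL):
    """Mean cross-entropy over pixels whose label is not ignore_label.

    logits: float array of shape (N, C), one row of class scores per pixel.
    labels: int array of shape (N,), values in 0..C-1 or ignore_label.
    """
    idx = np.flatnonzero(labels != ignore_label)  # indices of valid pixels
    if idx.size == 0:
        return 0.0  # no labelled pixels, so nothing contributes to the loss
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Log-probability of the true class, for valid pixels only.
    picked = log_probs[idx, labels[idx]]
    return float(-picked.mean())


# Three pixels, 20 classes; the middle pixel carries the ignore label.
logits = np.zeros((3, 20))
labels = np.array([0, IGNORE_LABEL, 5])
loss = masked_cross_entropy(logits, labels)
```

Changing the scores of the middle pixel leaves `loss` unchanged, because that pixel never enters the mean.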

But this approach has a problem. Once trained, the model ends up classifying these background pixels as belonging to one of the 20 classes instead. This is expected, since the loss does not penalize the model for whatever classification it makes for the background pixels.
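A small NumPy sketch can make this concrete, assuming a mean-reduced softmax cross-entropy with an ignore label (the helper name `masked_ce_grad` is hypothetical). The gradient with respect to the scores of an ignored pixel is exactly zero, yet taking the argmax over 20 scores still has to return one of the 20 classes.

```python
import numpy as np


def masked_ce_grad(logits, labels, ignore_label=255):
    """Gradient of the mean masked cross-entropy with respect to the logits.

    For a valid pixel the gradient is (softmax - one_hot) / n_valid;
    for an ignored pixel it is zero.
    """
    idx = np.flatnonzero(labels != ignore_label)
    grad = np.zeros_like(logits)
    if idx.size == 0:
        return grad
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    grad[idx] = probs[idx]
    grad[idx, labels[idx]] -= 1.0  # subtract the one-hot target
    grad /= idx.size
    return grad


rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 20))      # 4 pixels, 20 classes
labels = np.array([3, 255, 7, 255])    # pixels 1 and 3 are ignored
grad = masked_ce_grad(logits, labels)
predictions = logits.argmax(axis=1)    # always a value in 0..19
```

Rows 1 and 3 of `grad` are all zeros, so training never shapes what the model outputs there, while `predictions` still assigns those pixels to some foreground class.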

The second obvious solution is therefore to use an extra class for all the background pixels, which would be the 21st class in Cityscapes. However, here I am worried that I will 'waste' my model's capacity by teaching it to classify this additional unnecessary class.
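As a rough sketch of this second option, the label maps can be remapped so that every pixel with the ignore label gets class index 20 (the 21st class). The constant and function names below are assumptions for illustration, and the model would then need 21 output channels instead of 20.

```python
import numpy as np

NUM_FOREGROUND = 20                 # 'useful' classes, as counted in the question
IGNORE_LABEL = 255                  # label previously used for ignored pixels
BACKGROUND_CLASS = NUM_FOREGROUND   # index 20, i.e. the 21st class


def add_background_class(label_map):
    """Return a copy of label_map with ignored pixels mapped to the background class."""
    out = label_map.copy()
    out[label_map == IGNORE_LABEL] = BACKGROUND_CLASS
    return out


labels = np.array([[0, 255],
                   [19, 255]], dtype=np.uint8)
remapped = add_background_class(labels)
```

With this remapping no ignore label remains, so an ordinary cross-entropy over 21 classes penalizes foreground predictions on background pixels.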

What is the most accurate way of handling the background pixel classes?

2 answers

The second solution is definitely the better one. The background class is an additional class but not an unnecessary one, since it gives a clear differentiation between the classes you want to detect and the background.

In fact, assigning a class to the background is a standard procedure recommended in segmentation, where the background represents everything apart from your specific classes.

Maybe you can try using "Dice loss + Inverted Dice loss", which takes into account both foreground and background pixels.
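The answer does not define the two terms, so the following is only one plausible reading, written for the binary foreground/background case: the Dice loss is computed on the foreground probabilities, and the "inverted" Dice loss is assumed here to be the same loss computed on the complements (background). The function names and the smoothing constant `eps` are illustrative choices.

```python
import numpy as np


def dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|), smoothed by eps."""
    intersection = (probs * target).sum()
    denom = probs.sum() + target.sum()
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def dice_plus_inverted_dice(probs, target, eps=1e-6):
    """Dice loss on the foreground plus Dice loss on the background.

    probs:  predicted foreground probabilities in [0, 1].
    target: binary ground truth, 1 for foreground and 0 for background.
    """
    foreground = dice_loss(probs, target, eps)
    background = dice_loss(1.0 - probs, 1.0 - target, eps)  # assumed 'inverted' term
    return foreground + background


target = np.array([1.0, 0.0, 1.0, 0.0])
perfect = dice_plus_inverted_dice(target, target)       # prediction equals target
worst = dice_plus_inverted_dice(1.0 - target, target)   # every pixel wrong
```

Under this reading the combined loss is close to 0 for a perfect prediction and close to 2 when every pixel is wrong, so errors on background pixels are penalized as much as errors on foreground pixels.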