

What does entropy mean in this context?

I'm reading an image segmentation paper in which the problem is approached using the "signal separation" paradigm: the idea that a signal (in this case, an image) is composed of several signals (the objects in the image) plus noise, and the task is to separate out those signals (i.e. segment the image).

The output of the algorithm is a matrix S \in R^{M \times T} which represents a segmentation of the image into M components. T is the total number of pixels in the image, and S_{ij} is the value of the source component (/signal/object) i at pixel j.

In the paper I'm reading, the authors wish to select a component m \in [1, M] which matches certain smoothness and entropy criteria. But I'm failing to understand what entropy is in this case.

Entropy is defined as follows:

H(s_m) = -\sum_{n=1}^{256} p_n(s_m) \cdot \log_2(p_n(s_m)), \quad m = 1, \ldots, M

and they say that "\{p_n(s_m)\}_{n=1}^{256} are probabilities associated with the bins of the histogram of s_m".

The target component is a tumor, and the paper reads: "the tumor related component s_m with 'almost' constant values is expected to have the lowest value of entropy."

But what does low entropy mean in this context? What does each bin represent? What does a vector with low entropy look like?

link to paper

They are talking about Shannon entropy. One way to view entropy is to relate it to the amount of uncertainty about an event associated with a given probability distribution. Entropy can serve as a measure of 'disorder': as the level of disorder rises, entropy rises and events become less predictable.

Back to the definition of entropy in the paper:

H(s_m) = -\sum_{n=1}^{256} p_n(s_m) \cdot \log_2(p_n(s_m))

H(s_m) is the entropy of the random variable s_m. Here p_n(s_m) is the probability that s_m falls in bin n, and n runs over all the possible outcomes. The probability distribution p_n is calculated from the gray-level histogram, which is why the sum runs from 1 to 256: the bins represent the possible states.
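The definition above translates almost directly into code. A minimal sketch (the function name and NumPy implementation are mine, not from the paper) that estimates the entropy of a signal from its 256-bin gray-level histogram:

```python
import numpy as np

def shannon_entropy(values, bins=256):
    """Shannon entropy (in bits) of a 1-D array of gray levels,
    estimated from its histogram, per the paper's definition."""
    counts, _ = np.histogram(values, bins=bins, range=(0, 256))
    p = counts / counts.sum()   # p_n: probability mass of bin n
    p = p[p > 0]                # drop empty bins (0 * log 0 := 0)
    return -np.sum(p * np.log2(p))

# An "almost constant" signal concentrates in one bin -> entropy near 0.
flat = np.full(1000, 120)
print(shannon_entropy(flat))    # 0.0

# A uniformly random signal spreads over all 256 bins -> entropy near
# the maximum, log2(256) = 8 bits.
rng = np.random.default_rng(0)
noisy = rng.integers(0, 256, size=100_000)
print(shannon_entropy(noisy))
```

The maximum possible entropy with 256 bins is log2(256) = 8 bits, reached when every bin is equally likely; a constant signal, at the other extreme, has entropy 0.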

So what does this mean? In image processing, entropy can be used to classify textures: a given texture tends to have a characteristic entropy because certain patterns repeat themselves in roughly predictable ways. In the context of the paper, low entropy H(s_m) means low disorder, i.e. low variance within the component m. A component with low entropy is more homogeneous than a component with high entropy, which they use in combination with the smoothness criterion to classify the components.
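To make the selection step concrete, here is a hypothetical sketch (not the authors' code; the rescaling to [0, 255] before binning is my assumption) of picking the component of S with the lowest histogram entropy:

```python
import numpy as np

def lowest_entropy_component(S, bins=256):
    """Return (index, entropies) for the row of the M x T matrix S
    with minimal histogram entropy. Each component is rescaled to
    [0, 255] before binning so all rows share the same bin range."""
    entropies = []
    for s_m in S:
        s = s_m - s_m.min()
        if s.max() > 0:
            s = s / s.max() * 255
        counts, _ = np.histogram(s, bins=bins, range=(0, 256))
        p = counts / counts.sum()
        p = p[p > 0]
        entropies.append(-np.sum(p * np.log2(p)))
    return int(np.argmin(entropies)), entropies

rng = np.random.default_rng(1)
S = np.vstack([
    rng.integers(0, 256, (2, 5000)),   # two noisy components
    np.full((1, 5000), 50),            # one "almost constant" component
])
m, H = lowest_entropy_component(S)
print(m)   # 2 -> the nearly constant component has the lowest entropy
```

The nearly constant row ends up in a single bin, giving entropy 0, so it is selected over the noisy rows.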

Another way of looking at entropy is as a measure of information content. A vector with relatively 'low' entropy is a vector with relatively low information content; it might be [0 1 0 1 1 1 0]. A vector with relatively 'high' entropy is a vector with relatively high information content; it might be [0 242 124 222 149 13].
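Those two example vectors can be compared directly. Using the empirical distribution of the values (each distinct value is one outcome; this small helper is mine, for illustration):

```python
from collections import Counter

import numpy as np

def empirical_entropy(v):
    """Shannon entropy (bits) of the empirical value distribution of v."""
    counts = np.array(list(Counter(v).values()))
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

low  = [0, 1, 0, 1, 1, 1, 0]           # 2 distinct values
high = [0, 242, 124, 222, 149, 13]     # 6 distinct values, all different

print(round(empirical_entropy(low), 3))    # 0.985
print(round(empirical_entropy(high), 3))   # 2.585  (= log2(6))
```

The second vector's six equally likely values give the maximum entropy for six outcomes, log2(6) ≈ 2.585 bits, while the first needs less than one bit per element on average.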

It's a fascinating and complex subject which really can't be summarised in one post.

Entropy was introduced by Shannon (1948), where a higher value of entropy corresponds to more detailed information. Entropy is a measure of image information content and is interpreted as the average uncertainty of the information source. In an image, entropy is defined over the intensity levels that individual pixels can take. It is used in the quantitative analysis and evaluation of image detail; the entropy value is used because it provides a better comparison between images.

Perhaps another way to think about entropy and information content in an image is to consider how much the image can be compressed. Independent of the compression scheme (run-length encoding being one of many), you can imagine that a simple image with little information (low entropy) can be encoded with fewer bytes of data, while a completely random image (like white noise) cannot be compressed much, if at all.
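You can see this connection empirically with any off-the-shelf compressor. A small sketch (using zlib's DEFLATE, chosen only as a convenient example) comparing an all-zeros buffer with white noise:

```python
import zlib

import numpy as np

flat = bytes(100_000)    # 100 kB of zeros: minimal entropy
rng = np.random.default_rng(0)
noise = rng.integers(0, 256, 100_000, dtype=np.uint8).tobytes()  # white noise

print(len(zlib.compress(flat)))    # tiny: repeated zeros compress extremely well
print(len(zlib.compress(noise)))   # ~100 kB: random bytes barely compress
```

The zero buffer shrinks by several orders of magnitude, while the noise buffer stays essentially the same size, mirroring the low-entropy/high-entropy distinction above.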
