How to use an optimization algorithm to find the best possible parameter

I'm trying to find a good interval of colors for color masking in order to extract skin from images.

I have a database of images and masks for extracting skin from those images. Here's an example of a sample:

[image: sample image]

I apply the mask to each image in order to get something like this:

[image: masked sample result]

I take all the pixels from all the masked images and remove the black pixels, keeping only the pixels that contain skin. This way I gather pixels covering the different skin tones of different people.

This is the code I'm using for this:

import cv2
import numpy as np

# collect the (N, 3) pixel array of every masked image
masked_pixels = []
for img_color, img_mask in zip(COLORED_IMAGES, MASKS):

    # masking
    img_masked = cv2.bitwise_and(img_color, img_mask)

    # transforming into pixels array
    masked_pixels.append(img_masked.reshape(-1, img_masked.shape[2]))

# merging all pixels from all samples
all_pixels = np.concatenate(masked_pixels, axis=0)

# removing black
all_pixels = all_pixels[~(all_pixels == 0).all(axis=1)]

# the pixels are left unsorted: np.sort(all_pixels) sorts along the last axis,
# which would reorder the B, G, R values inside each pixel

# reshape into a NB_PIXELSx1 image in order to create the histogram
all_pixels = all_pixels.reshape(len(all_pixels), 1, 3)

# converting the NB_PIXELSx1 image containing all skin colors from the dataset samples
all_pixels = cv2.cvtColor(all_pixels, cv2.COLOR_BGR2YCR_CB)

After extracting all shades of color from the different skins, I create a histogram that lets me see which colors are more common. The code that creates the histogram is too long to post, but this is the result:

[image]

Then I take the turning point of each channel's graph and choose a distance for that channel, say 20. The interval for that channel is [turning point - 20, turning point + 20].

[image]

So let's say that we got the following:

R:

  • turning point: 142
  • distance: 61
  • interval: [81, 203]

G:

  • turning point: 155
  • distance: 10
  • interval: [145, 165]

B:

  • turning point: 109
  • distance: 14
  • interval: [95, 123]

I would use these intervals to create masks of the colored images from the dataset in order to extract the skin (left: my intervals mask, right: ground truth mask):

[image]

The masks extracted using my intervals are compared with the preexisting masks of the dataset, and the accuracy is calculated to see how good my intervals are:

precision_moy = 0
accuracy_moy = 0

# bounds must follow the channel order of `image` (cv2.imread returns BGR)
Min = np.array([81, 145, 95], np.uint8)
Max = np.array([203, 165, 123], np.uint8)

for image, img in zip(COLORED, GROUND_TRUTH):
    mask = cv2.inRange(image, Min, Max)

    TP = 0 # True Positive
    TN = 0 # True Negative
    FP = 0 # False Positive
    FN = 0 # False Negative

    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if mask[i,j] == 255 and img[i,j,0] == 255:
                TP = TP + 1
            if mask[i,j] == 0 and img[i,j,0] == 0:
                TN = TN + 1
            if mask[i,j] == 255 and img[i,j,0] == 0:
                FP = FP + 1
            if mask[i,j] == 0 and img[i,j,0] == 255:
                FN = FN + 1

    # guard against an empty predicted mask (TP + FP == 0)
    precision = TP / (TP + FP) if (TP + FP) else 0.0
    accuracy = (TP + TN) / (TP + TN + FP + FN)

    precision_moy = precision_moy + precision
    accuracy_moy = accuracy_moy + accuracy

precision_moy = precision_moy / len(COLORED)
accuracy_moy = accuracy_moy / len(COLORED)

I keep changing the intervals, testing and calculating the accuracy, in order to find the best possible interval for each channel. The change is made by multiplying the distance by a number between 0 and 2. For example:

OLD R:

  • turning point: 142
  • distance: 61
  • interval: [81, 203]

NEW DISTANCE = OLD DISTANCE * 0.7 = 61 * 0.7 ≈ 43

NEW R:

  • turning point: 142
  • distance: 43
  • interval: [99, 185]
  • To get a wider interval I would multiply by a number in (1, 2]
  • To get a narrower interval I would multiply by a number in (0, 1)
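
The update rule above can be written as a small helper. This is only a minimal sketch: the function name `scaled_interval` and the clamping of the result to [0, 255] are assumptions added for illustration and are not part of the original code.

```python
def scaled_interval(turning_point, distance, factor):
    """Return (low, high) after scaling the half-width `distance` by `factor`.

    Assumption: the result is clamped to the valid 8-bit range [0, 255].
    """
    new_distance = round(distance * factor)
    low = max(0, turning_point - new_distance)
    high = min(255, turning_point + new_distance)
    return low, high


# The R example from the text: 61 * 0.7 rounds to 43, giving [99, 185]
print(scaled_interval(142, 61, 0.7))
```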

Now, to my question:

I would like to find the best possible interval for each channel using an optimization method instead of changing the intervals manually and at random. What optimization method should I use, and how would I use it?

Thank you for taking the time. Your help is appreciated.

One basic approach that converges quickly but may not yield the global optimum is hill climbing.

Hill climbing is a form of local search that can be used in this case.
It works by moving from one state (solution) to the next depending on the score, or performance, of each state. If no better state can be found, the current state is returned as the solution.

There are multiple ways of implementing hill climbing; in your case I would do something like this:

The State: an item containing the Min and Max numpy arrays, with the accuracy or F-measure of the mask created from these arrays and applied to the image as its score property.

For now I suggest you only use symmetrical ranges, which massively reduces the search space.

Starting State
You can create a starting state at random, taking a random interval for each channel (Red, Green, Blue). This is especially useful if you run the algorithm multiple times. Determine the maximum and minimum for each interval based on your histograms.

Iteration Process (this is where the searching is done)
Create a loop, exited only by the stopping condition below, in which you generate successor states for the current state. Increase or decrease the interval of each channel of the current state by, say, 10; every combination of those new intervals can be a successor state.
Another way is to switch channel on each iteration. In the first iteration you create one successor state with the Red channel interval of the current state decreased by 10 and one with it increased by 10. In the second iteration you change the Green channel, in the third the Blue channel, and so on.

You then create a mask from each successor state and apply it to the image, which gives the performance of each successor state.
Select the best-performing successor state and take it as the current state if it performs better than the current state.

Repeat this process until the best successor state no longer performs better than the current state; then you know you have hit a local optimum. Return this state as the solution.
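
The loop described above could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: the state is reduced to one symmetric distance per channel, the names `hill_climb` and `toy_score` and the bounds `low`/`high` are hypothetical, and `toy_score` only stands in for the real accuracy computed over the dataset.

```python
import itertools


def hill_climb(start, score, step=10, low=0, high=127):
    """Greedy hill climbing over per-channel distances.

    `start` is a tuple of distances, one per channel; `score` maps such a
    tuple to a number to maximize (e.g. the mean accuracy over the dataset).
    """
    current = tuple(start)
    current_score = score(current)
    while True:
        # successors: every combination of -step / 0 / +step per channel
        scored = {}
        for deltas in itertools.product((-step, 0, step), repeat=len(current)):
            candidate = tuple(
                min(high, max(low, d + delta))
                for d, delta in zip(current, deltas)
            )
            if candidate != current and candidate not in scored:
                scored[candidate] = score(candidate)
        if not scored:
            return current, current_score
        best = max(scored, key=scored.get)
        if scored[best] <= current_score:
            # no successor improves on the current state: local optimum
            return current, current_score
        current, current_score = best, scored[best]


# Toy score (assumption) standing in for accuracy: peaks at (60, 10, 20)
def toy_score(distances):
    target = (60, 10, 20)
    return -sum((d - t) ** 2 for d, t in zip(distances, target))


print(hill_climb((20, 20, 20), toy_score))
```

With a real score function each evaluation means masking the whole dataset, so storing the score of every visited state, as the `scored` dictionary does within one iteration, keeps the cost down.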

Problems
As noted above, this algorithm finds the local optimum reachable from the starting state. This is because the algorithm is greedy.
You may therefore want to restart it from different starting states, so that more of the search space is explored and the chance of finding the global maximum increases.
If you have multiple threads, you can run multiple instances in parallel and return the best state among the results of all instances.
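
The restart idea can be illustrated on a toy problem. The sketch below is sequential rather than threaded, and every name in it (`best_of_restarts`, `toy_score`, `toy_local_search`) is hypothetical; the one-dimensional landscape is an assumption chosen so that it has one local peak and one global peak.

```python
import random


def best_of_restarts(local_search, random_start, n_restarts=20):
    """Run `local_search` from several starts and keep the best result.

    `local_search(start)` must return a (state, score) pair.
    """
    results = [local_search(random_start()) for _ in range(n_restarts)]
    return max(results, key=lambda pair: pair[1])


# Toy landscape (assumption): local peak at x=20 (score 5),
# global peak at x=80 (score 10)
def toy_score(x):
    return max(5 - abs(x - 20), 10 - abs(x - 80))


def toy_local_search(x):
    # move to the better neighbour until neither neighbour improves the score
    while True:
        best = max((x - 1, x + 1), key=toy_score)
        if toy_score(best) <= toy_score(x):
            return x, toy_score(x)
        x = best


rng = random.Random(0)
print(best_of_restarts(toy_local_search, lambda: rng.randint(0, 100)))
```

A single run started near x=5 stops at the local peak, while a handful of random starts is very likely to include one that reaches the global peak.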

Hill climbing is not the best optimization algorithm, but it is very fast and easy to implement.

I would suggest using genetic optimization, which can be easily implemented for a problem as simple as yours. Since the problem is relatively "small", it should not take much longer to find the optimal solution than with a local optimization method such as the hill climbing suggested by @Leander. A genetic algorithm is a metaheuristic search, so it is not guaranteed to find the optimal solution, but it should get you very close. In fact, for such a small problem the chance of finding the global optimum is very high.

As a start I would recommend taking a look at DEAP so you don't have to implement anything yourself ( https://deap.readthedocs.io/en/master/ ). It contains very good implementations of many genetic algorithm variations, and there are tutorials with nice examples. With a bit of effort you should be able to put together a simple optimization algorithm in a day or two.

For simplicity, "genetic algorithm" is abbreviated as GA from now on.

Some tips on where to start:

  • I suggest you start with the simplest variation, eaSimple, in DEAP. If that proves unsatisfactory you can always move to something a little more sophisticated, but I don't think that will be necessary.
  • Your Individual in the GA will have 6 components: [blue_low, blue_high, green_low, green_high, red_low, red_high]. This also addresses the problem of asymmetric intervals mentioned by @Leander in the comments.
  • Mutations will be done by randomly altering elements of the individual.
  • For the fitness function you can use your accuracy as you are computing it now.

That is essentially all you need to build a GA for your problem. This example, https://deap.readthedocs.io/en/master/examples/ga_onemax.html , should get you up and running. You just need to define your own individuals, operators and fitness evaluation function as mentioned in the previous steps.
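
To make the ingredients listed above concrete, here is a hand-rolled GA in plain Python. It is deliberately not DEAP and does not reproduce eaSimple; it is a simplified stand-in under stated assumptions. The names `repair` and `genetic_search`, the tournament size, the mutation range of ±15 and the elitism step are all choices made for this illustration.

```python
import random


def repair(ind):
    """Clamp genes to [0, 255] and order each (low, high) pair."""
    ind = [min(255, max(0, g)) for g in ind]
    for k in range(0, 6, 2):
        ind[k], ind[k + 1] = sorted(ind[k:k + 2])
    return ind


def genetic_search(fitness, pop_size=40, generations=60,
                   cx_prob=0.5, mut_prob=0.2, rng=None):
    """Maximize `fitness` over [b_lo, b_hi, g_lo, g_hi, r_lo, r_hi]."""
    rng = rng or random.Random()
    pop = [repair([rng.randint(0, 255) for _ in range(6)])
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # tournament selection (size 3, an assumption)
        parents = [max(rng.sample(pop, 3), key=fitness)
                   for _ in range(pop_size)]
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            a, b = a[:], b[:]
            if rng.random() < cx_prob:
                # one-point crossover
                cut = rng.randint(1, 5)
                a[cut:], b[cut:] = b[cut:], a[cut:]
            children += [a, b]
        # mutation: shift each gene by a small random amount
        pop = [repair([g + rng.randint(-15, 15)
                       if rng.random() < mut_prob else g
                       for g in child])
               for child in children]
        # elitism: never lose the best individual seen so far
        pop[0] = best
        best = max(pop, key=fitness)
    return best, fitness(best)


# Toy fitness (assumption) standing in for the accuracy over the dataset
TARGET = [95, 123, 145, 165, 81, 203]


def toy_fitness(ind):
    return -sum(abs(g - t) for g, t in zip(ind, TARGET))


print(genetic_search(toy_fitness, rng=random.Random(1)))
```

In real use the fitness would wrap the accuracy computation over the dataset, and since this sketch evaluates it many times per generation, caching the fitness of each individual would be worthwhile.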

A final note on the use of any general optimization method. As I understand it, this is a discrete problem in 6 dimensions, since you have 6 components (blue_low, blue_high, green_low, green_high, red_low, red_high) and each of them has only 256 possible values (0 to 255). This rules out most optimization methods, since they require the problem to be continuous.

In your current algorithm, you are finding the mode (i.e., the peak) of the colorspace data and then taking the bins (color values) symmetrically around the mode.

For a normal distribution curve, the % of population is determined by the number of standard deviations around the mean, as shown below:

[image: normal distribution curve]

In a normal distribution, mean, median and mode are the same. However, if your distribution is skewed, the population on the left side of the mean won't be the same as the population on the right side of the mean. So, a simple adjustment that you can make is as follows:

Let p_left be the % of population to the left of the peak and p_right be the % of population to the right of the peak. For example, let p_left = 40% and p_right = 60%. Instead of the fixed interval width of 40 that you are using (-20, +20), you can set another parameter, the % of selected population, say 15%. This is the total population we want around the mode (including the mode). You can then split this 15% in the proportion of the left vs. right population.

left proportion = 15% x 40% = 6%
right proportion = 15% x 60% = 9%

You should correct these 6% and 9% by calculating the mode's % of population and taking half of it out of each. For example, if the mode is 5% of the population, you should deduct 2.5% from both 6% and 9%. This gives adjusted p_left and p_right as:

p_left = 6% - 2.5% = 3.5%
p_right = 9% - 2.5% = 6.5%

Instead of dividing the interval evenly around the mode, you compute how many bins on the left and on the right need to be included to determine the range. For example, you may find that including 5 bins on the left adds up to 3.5% of the total population and including 3 bins on the right gives you approximately 6.5% of the population.

So, your range becomes (x - 5, x + 3), where x is the x coordinate of the mode.
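
The procedure above might be sketched as follows for a single channel. It rests on a few assumptions that the text leaves open: p_left and p_right are computed with the mode bin excluded and normalized to sum to 1, bins are added until the accumulated share reaches the target (so the target can be slightly overshot), and the name `asymmetric_interval` is hypothetical.

```python
import numpy as np


def asymmetric_interval(hist, selected_fraction):
    """Return (low, high) bin indices around the mode of `hist`.

    `hist` holds pixel counts per bin; `selected_fraction` is the share of
    the population wanted around the mode, the mode bin included.
    """
    hist = np.asarray(hist, dtype=float)
    share = hist / hist.sum()
    mode = int(np.argmax(share))

    # split of the population to the left and right of the mode
    left_total = share[:mode].sum()
    right_total = share[mode + 1:].sum()
    sides = left_total + right_total
    p_left = left_total / sides if sides > 0 else 0.5
    p_right = 1.0 - p_left

    # split the selected share, then take half the mode's share out of each
    target_left = max(0.0, selected_fraction * p_left - share[mode] / 2)
    target_right = max(0.0, selected_fraction * p_right - share[mode] / 2)

    eps = 1e-12  # tolerance for floating-point accumulation
    low, acc = mode, 0.0
    while low > 0 and acc + eps < target_left:
        low -= 1
        acc += share[low]
    high, acc = mode, 0.0
    while high < len(share) - 1 and acc + eps < target_right:
        high += 1
        acc += share[high]
    return low, high


# Skewed toy histogram: mode at bin 4, more population on the right
print(asymmetric_interval([4, 6, 10, 10, 20, 10, 10, 10, 10, 10], 0.6))
```

For the toy histogram the mode bin holds 20% of the pixels, the left side 30% and the right side 50%, so the interval extends further to the right than to the left.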

Parameter estimation: To determine the right value for the % of selected population (the 15% in the example above), you can compute the histograms on a standard set of your masked images and use that to determine a good initial estimate. Essentially, count the unmasked pixels in your masked images and divide by the total number of pixels.

Actually, finding the global optimum for a given dataset is not too complicated. For simplicity, let's first assume you have grayscale images, since each of the colors is treated independently (I believe). It would be a bit more complicated if you were scoring a pixel based on all 3 colors falling within the required interval, but it seems like you're not.

So anyway, depending on the size of your dataset, you can just exhaustively check each interval for each image. For instance, if each pixel only takes integer values in [0, 255], there are only on the order of 100 interval sizes you even need to consider. So you can compute the accuracy for each candidate interval size and each image, and simply take the interval that yields the highest average accuracy. Repeat across all colors. This is certainly the brute-force approach, but unless your dataset is quite large it shouldn't be computationally expensive when using optimized matrix operations. If your dataset is huge, applying this technique to a sufficiently large random sample of images would yield an approximate (though not globally optimal) solution.
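
One way to make the exhaustive check cheap is to work on histograms instead of images. The sketch below is a variation under assumptions rather than exactly the procedure described: it scores every [low, high] pair (not only sizes around a fixed center), it uses the pooled accuracy over all pixels instead of the average of per-image accuracies, and the name `best_interval` is hypothetical.

```python
import numpy as np


def best_interval(values, is_skin):
    """Exhaustively find the [low, high] interval with the best pooled accuracy.

    `values` are 8-bit values of one channel for all pixels; `is_skin` is
    the ground-truth label of each pixel.
    """
    values = np.asarray(values).ravel()
    is_skin = np.asarray(is_skin, dtype=bool).ravel()

    skin_hist = np.bincount(values[is_skin], minlength=256)
    bg_hist = np.bincount(values[~is_skin], minlength=256)

    # cum[k] = number of pixels with value < k
    skin_cum = np.concatenate(([0], np.cumsum(skin_hist)))
    bg_cum = np.concatenate(([0], np.cumsum(bg_hist)))

    low = np.arange(256)[:, None]
    high = np.arange(256)[None, :]
    # pixel counts inside [low, high] for every pair, via broadcasting
    tp = skin_cum[high + 1] - skin_cum[low]
    fp = bg_cum[high + 1] - bg_cum[low]
    tn = bg_hist.sum() - fp
    accuracy = (tp + tn) / values.size
    accuracy[low > high] = -1.0  # discard empty intervals

    best_low, best_high = np.unravel_index(np.argmax(accuracy), accuracy.shape)
    return int(best_low), int(best_high), float(accuracy[best_low, best_high])


# Tiny example: skin values 50..70, background values 10 and 200
print(best_interval([10, 50, 60, 70, 200], [0, 1, 1, 1, 0]))
```

After the two histograms are built, all 256 x 256 candidate pairs are scored in a single array operation, so the cost no longer depends on the number of pixels.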

As an aside, the way you are currently computing your accuracies between mask and ground truth is quite inefficient. The rule of thumb is to use numpy matrix operations whenever you can, because they're much more efficient (there are some cool algorithmic tricks that save time on matrix operations, and they're written in C, so they are faster for that reason as well).

You can replace this:

for i in range(mask.shape[0]):
    for j in range(mask.shape[1]):
        if mask[i,j] == 255 and img[i,j,0] == 255:
            TP = TP + 1
        if mask[i,j] == 0 and img[i,j,0] == 0:
            TN = TN + 1
        if mask[i,j] == 255 and img[i,j,0] == 0:
            FP = FP + 1
        if mask[i,j] == 0 and img[i,j,0] == 255:
            FN = FN + 1

With the equivalent matrix operation:

# boolean maps of predicted / ground-truth skin and background
pred_skin = mask == 255
pred_bg = mask == 0
gt_skin = img[:, :, 0] == 255
gt_bg = img[:, :, 0] == 0

TP = np.count_nonzero(pred_skin & gt_skin)
TN = np.count_nonzero(pred_bg & gt_bg)
FP = np.count_nonzero(pred_skin & gt_bg)
FN = np.count_nonzero(pred_bg & gt_skin)

This will save you time, especially if you use a brute-force approach like the one I suggested, but it is also good practice in general.
