How to compute the mean and standard deviation of a set of images

Question

I would like to know how to compute the mean and std of a given dataset of RGB images.
For example, for imagenet we have imagenet_stats: ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]).
I tried:

rgb_values = [np.mean(Image.open(img).getdata(), axis=0)/255 for img in imgs_path]
np.mean(rgb_values, axis=0)
np.std(rgb_values, axis=0)

I am not sure whether the values I get are correct.
What would be a better implementation?

Answer 1

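Two solutions:

The first solution iterates over the images. It is much slower than the second solution, and it uses the same amount of memory, because it first loads all the images and then stores them in a list. So it is worse than the second solution unless you change how the images are loaded: read them from disk and process them one at a time.
The second solution needs to hold all the images in memory at the same time. It is much faster because it is fully vectorized.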
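The "load and process one image at a time" variant mentioned above could look roughly like the sketch below. It is not part of the original answer: the function names `iter_rgb_arrays` and `streaming_mean_std` are made up for illustration, and it assumes every image can be converted to 8-bit RGB. It keeps only a running sum, a running sum of squares and a pixel count, so memory use does not grow with the number of images, and the images do not need to be the same size.

```python
import numpy as np

def iter_rgb_arrays(paths):
    """Yield one image at a time as a uint8 array of shape (H, W, 3)."""
    from PIL import Image  # imported lazily; only needed when reading files
    for path in paths:
        with Image.open(path) as img:
            yield np.asarray(img.convert("RGB"))

def streaming_mean_std(images):
    """Per-channel mean and std over all pixels, scaled to [0, 1]."""
    total = np.zeros(3, dtype=np.float64)     # running sum per channel
    total_sq = np.zeros(3, dtype=np.float64)  # running sum of squares per channel
    n_pixels = 0
    for image in images:
        pixels = image.reshape(-1, 3).astype(np.float64) / 255.0
        total += pixels.sum(axis=0)
        total_sq += (pixels ** 2).sum(axis=0)
        n_pixels += pixels.shape[0]
    mean = total / n_pixels
    # Var[x] = E[x^2] - E[x]^2; clip tiny negative values caused by rounding.
    var = np.maximum(total_sq / n_pixels - mean ** 2, 0.0)
    return mean, np.sqrt(var)

# Usage (imgs_path is the list of file paths from the question):
# mu_rgb, std_rgb = streaming_mean_std(iter_rgb_arrays(imgs_path))
```

The sum-of-squares formula is convenient for a single pass, but it can lose a little precision when the variance is very small compared to the mean; for pixel values in [0, 1] accumulated in float64 this is usually not a concern.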

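First solution (iterating over the images):

For each channel (R, G, B), here is how to compute the mean and std of all pixels across all images:

Requirement:

Every image has the same number of pixels.

If that is not the case, use the second solution (below).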
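To see why the requirement matters, here is a tiny made-up example (the arrays `small` and `large` are illustrative stand-ins for one channel of two images of different sizes). A plain mean of per-image means gives every image the same weight, so it drifts away from the mean over all pixels; weighting each image by its pixel count fixes that.

```python
import numpy as np

small = np.array([0.0, 0.0])            # "image" with 2 pixels
large = np.array([1.0, 1.0, 1.0, 1.0])  # "image" with 4 pixels

# Unweighted mean of the per-image means: each image counts equally.
mean_of_means = np.mean([small.mean(), large.mean()])

# Mean over all pixels at once: each pixel counts equally.
true_mean = np.concatenate([small, large]).mean()

# Weighting each per-image mean by its pixel count recovers the true mean.
weighted_mean = np.average(
    [small.mean(), large.mean()],
    weights=[small.size, large.size],
)

print(mean_of_means, true_mean, weighted_mean)
```

Here the unweighted result is 0.5, while the true and weighted means are both 4/6.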

import numpy as np
from PIL import Image

images_rgb = [np.array(Image.open(img).getdata()) / 255. for img in imgs_path]
# Each image_rgb is of shape (n, 3), 
# where n is the number of pixels in each image,
# and 3 are the channels: R, G, B.

means = []
for image_rgb in images_rgb:
    means.append(np.mean(image_rgb, axis=0))
mu_rgb = np.mean(means, axis=0)  # mu_rgb.shape == (3,)

variances = []
for image_rgb in images_rgb:
    var = np.mean((image_rgb - mu_rgb) ** 2, axis=0)
    variances.append(var)
std_rgb = np.sqrt(np.mean(variances, axis=0))  # std_rgb.shape == (3,)

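Proof

...that the mean and std are the same whether they are computed this way or computed from all pixels at once:

Suppose each image has n pixels (with values vals_i), and there are m images.

Then there are (n*m) pixels in total.

The real_mean of all pixels in all vals_i is: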

total_sum = sum(vals_1) + sum(vals_2) + ... + sum(vals_m)
real_mean = total_sum / (n*m)

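Adding up the mean of each image separately (each image has n pixels, so its mean is sum(vals_i) / n):

sum_of_means = sum(vals_1) / n + sum(vals_2) / n + ... + sum(vals_m) / n
             = (sum(vals_1) + sum(vals_2) + ... + sum(vals_m)) / n

Now, what is the relationship between real_mean and sum_of_means? As you can see,

real_mean = sum_of_means / m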

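Similarly, using the formula for the standard deviation, the real_std of all pixels in all vals_i is:

sum_of_square_diffs =  sum((vals_1 - real_mean) ** 2)
                     + sum((vals_2 - real_mean) ** 2)
                     + ...
                     + sum((vals_m - real_mean) ** 2)
real_std = sqrt( sum_of_square_diffs / (n*m) )

Looked at from another angle, real_std is the square root of the average, over the m images, of each image's mean squared deviation (over its n values) from real_mean.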
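The same argument in compact notation, as a sketch. The symbols are introduced here for illustration and are not in the original answer: x_{ij} is the j-th pixel value of image i, \mu is real_mean and \sigma is real_std, with m images of n pixels each.

```latex
\mu = \frac{1}{nm}\sum_{i=1}^{m}\sum_{j=1}^{n} x_{ij}
    = \frac{1}{m}\sum_{i=1}^{m}\Bigl(\frac{1}{n}\sum_{j=1}^{n} x_{ij}\Bigr),
\qquad
\sigma^2 = \frac{1}{nm}\sum_{i=1}^{m}\sum_{j=1}^{n} (x_{ij}-\mu)^2
         = \frac{1}{m}\sum_{i=1}^{m}\Bigl(\frac{1}{n}\sum_{j=1}^{n} (x_{ij}-\mu)^2\Bigr).
```

Both equalities only regroup the double sum, which is why they rely on every image having the same n.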

Verification

The actual mean and std:

rng = np.random.default_rng(0)
vals = rng.integers(1, 100, size=100)  # data

mu = np.mean(vals)
print(mu)
print(np.std(vals))

50.93                 # real mean
28.048976808432776    # real standard deviation

Compare this with the image-by-image approach:

n_images = 10

means = []
for subset in np.split(vals, n_images):
    means.append(np.mean(subset))
new_mu = np.mean(means)

variances = []
for subset in np.split(vals, n_images):
    var = np.mean((subset - mu) ** 2)
    variances.append(var)

print(new_mu)
print(np.sqrt(np.mean(variances)))

50.92999999999999     # calculated mean
28.048976808432784    # calculated standard deviation
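One detail in the comparison above is easy to miss: the squared deviations are taken from the global mean mu, not from each subset's own mean. The sketch below uses small made-up data to show what happens otherwise. Averaging the per-subset variances ignores the spread between the subset means and underestimates the overall std.

```python
import numpy as np

vals = np.array([0, 0, 1, 1, 10, 10, 11, 11], dtype=float)
subsets = np.split(vals, 2)  # two "images" of four pixels each
mu = vals.mean()

# Deviations from the global mean (the approach used above).
std_global = np.sqrt(np.mean([np.mean((s - mu) ** 2) for s in subsets]))

# Deviations from each subset's own mean (drops the spread between subsets).
std_local = np.sqrt(np.mean([np.var(s) for s in subsets]))

print(np.std(vals), std_global, std_local)
```

With this data the global-mean version matches np.std(vals), about 5.02, while the per-subset version gives 0.5.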

Second solution (fully vectorized):

Use all the pixels of all images at once.

rgb_values = np.concatenate(
    [Image.open(img).getdata() for img in imgs_path], 
    axis=0
) / 255.

# rgb_values.shape == (n, 3), 
# where n is the total number of pixels in all images, 
# and 3 are the 3 channels: R, G, B.

# Each value is in the interval [0; 1]

mu_rgb = np.mean(rgb_values, axis=0)  # mu_rgb.shape == (3,)
std_rgb = np.std(rgb_values, axis=0)  # std_rgb.shape == (3,)
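A possible variation on the vectorized approach, as a sketch rather than the answer's own code: the helper names `load_rgb_pixels` and `channel_stats` are hypothetical. It converts every image to RGB first, so grayscale or RGBA files still yield three channels, and it reads pixels with np.asarray instead of getdata().

```python
import numpy as np

def load_rgb_pixels(path):
    """Return the pixels of one image as a float array of shape (n, 3) in [0, 1]."""
    from PIL import Image  # imported lazily; only needed when reading files
    with Image.open(path) as img:
        rgb = np.asarray(img.convert("RGB"))  # force 3 channels, shape (H, W, 3)
    return rgb.reshape(-1, 3) / 255.0

def channel_stats(pixel_arrays):
    """Per-channel mean and std over the concatenation of (n_i, 3) arrays."""
    rgb_values = np.concatenate(list(pixel_arrays), axis=0)
    return rgb_values.mean(axis=0), rgb_values.std(axis=0)

# Usage (imgs_path is the list of file paths from the question):
# mu_rgb, std_rgb = channel_stats(load_rgb_pixels(p) for p in imgs_path)
```

As with the code above, all pixels are held in memory at once, so this only suits datasets that fit in RAM.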

Solution 1
0 2022-08-15 09:23:28

兩種解決方案：

第一個解決方案（迭代圖像）：

要求：

證明

確認

第二種解決方案（完全矢量化）：

如何計算一組圖像的均值和標准差

問題描述

1 個解決方案

解決方案1 0 2022-08-15 09:23:28

兩種解決方案：

第一個解決方案（迭代圖像）：

要求：

證明

確認

第二種解決方案（完全矢量化）：

解決方案1
0 2022-08-15 09:23:28