Calculating Mean & STD for Batch [Python/Numpy]

I'm looking to efficiently calculate the mean and standard deviation (STD) per channel over a batch.


Details:

  • batch size: 128
  • images: 32x32
  • 3 channels (RGB)

So each batch has shape [128, 32, 32, 3].

There are lots of batches (the naive method takes ~4 min over all batches).

I would like to output 2 arrays: (meanR, meanG, meanB) and (stdR, stdG, stdB).


(Also, if there is an efficient way to perform arithmetic operations on the batches after calculating this, that would be helpful. For example, subtracting the mean of the whole dataset from each image.)

If I understood you correctly, you want to calculate the mean and std values over all images:

Demo: 2 images of shape (2,2,3) each (for the sake of simplicity):

In [189]: a
Out[189]:
array([[[[ 1,  2,  3],
         [ 4,  5,  6]],

        [[ 7,  8,  9],
         [10, 11, 12]]],


       [[[13, 14, 15],
         [16, 17, 18]],

        [[19, 20, 21],
         [22, 23, 24]]]])

In [190]: a.shape
Out[190]: (2, 2, 2, 3)

In [191]: np.mean(a, axis=(0,1,2))
Out[191]: array([ 11.5,  12.5,  13.5])

In [192]: np.einsum('ijkl->l', a)/float(np.prod(a.shape[:3]))
Out[192]: array([ 11.5,  12.5,  13.5])
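The question also asks for the standard deviation, and the same axis tuple works with np.std. Below is a minimal sketch on the demo array; rebuilding `a` with np.arange is an assumption about how it was created, and the einsum variant relies on the identity Var[x] = E[x^2] - E[x]^2 rather than anything shown in the original answer.

```python
import numpy as np

# Rebuild the demo array: 2 images of shape (2, 2, 3) holding the values 1..24.
# (Assumption: the original `a` was created in an equivalent way.)
a = np.arange(1, 25).reshape(2, 2, 2, 3)

# Reduce over batch, height and width; only the channel axis remains.
mean = np.mean(a, axis=(0, 1, 2))
std = np.std(a, axis=(0, 1, 2))  # population std (ddof=0), NumPy's default

# Same std via einsum, using Var[x] = E[x^2] - E[x]^2.
n = np.prod(a.shape[:3])
a64 = a.astype(np.float64)  # cast first so squaring cannot overflow small ints
mean_sq = np.einsum('ijkl,ijkl->l', a64, a64) / n
std_einsum = np.sqrt(mean_sq - mean ** 2)

print(mean)
print(std)
print(std_einsum)
```

Both std results should agree up to floating-point rounding.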

Speed measurements:

In [202]: a = np.random.randint(255, size=(128,32,32,3))

In [203]: %timeit np.mean(a, axis=(0,1,2))
9.48 ms ± 822 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [204]: %timeit np.einsum('ijkl->l', a)/float(np.prod(a.shape[:3]))
1.82 ms ± 22.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
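Since the question mentions many batches, one possible way to get dataset-wide statistics is to accumulate per-channel sums and sums of squares batch by batch, then combine them at the end. This is only a sketch: the helper name `dataset_mean_std` and the randomly generated batches are hypothetical, and the sum-of-squares formula is assumed to be accurate enough for 8-bit pixel values in float64.

```python
import numpy as np

def dataset_mean_std(batches):
    """Per-channel mean and std over an iterable of (N, H, W, C) batches.

    Hypothetical helper: accumulates sum(x) and sum(x^2) per channel so the
    whole dataset never has to be in memory at once.
    """
    count = 0
    total = None
    total_sq = None
    for batch in batches:
        x = np.asarray(batch, dtype=np.float64)
        if total is None:
            total = np.zeros(x.shape[-1])
            total_sq = np.zeros(x.shape[-1])
        total += x.sum(axis=(0, 1, 2))
        total_sq += np.einsum('ijkl,ijkl->l', x, x)
        count += x.shape[0] * x.shape[1] * x.shape[2]
    mean = total / count
    var = total_sq / count - mean ** 2       # Var[x] = E[x^2] - E[x]^2
    return mean, np.sqrt(np.maximum(var, 0.0))  # clamp tiny negative rounding

# Stand-in data: 4 random batches shaped like the ones in the question.
rng = np.random.default_rng(0)
batches = [rng.integers(0, 256, size=(128, 32, 32, 3)) for _ in range(4)]
mean, std = dataset_mean_std(batches)
print(mean, std)
```

The result should match np.mean and np.std applied to the concatenation of all batches, without ever building that concatenated array.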

Assuming you want the mean over multiple axes (if I didn't get you wrong): numpy.mean(a, axis=None) already supports this when axis is a tuple.

I'm not sure what you mean by the naive method.
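For the follow-up about arithmetic on the batches: once the per-channel statistics have shape (3,), NumPy broadcasting lines them up with the trailing channel axis, so no loop is needed. The sketch below uses a random batch and that batch's own statistics as a stand-in for dataset-wide values; that substitution is an assumption made only to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.random((128, 32, 32, 3))  # stand-in for one batch of images

# Per-channel statistics, each of shape (3,).
# In practice these would be the dataset-wide values.
mean = batch.mean(axis=(0, 1, 2))
std = batch.std(axis=(0, 1, 2))

# Shape (3,) broadcasts against the last axis of (128, 32, 32, 3),
# so every pixel of every image is shifted and scaled per channel.
centered = batch - mean
normalized = centered / std

print(normalized.shape)
```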

You can use this method to calculate the mean and std of R, G, B.

import numpy as np

a = np.random.rand(128,32,32,3)
means = np.mean(a, axis=(0, 1, 2))  # (meanR, meanG, meanB)
stds = np.std(a, axis=(0, 1, 2))    # (stdR, stdG, stdB)

Here axis=(0, 1, 2) reduces over the batch, height, and width axes, so only axis 3 (the R, G, B channels) remains. You can also refer to this related question: "Get mean of 2D slice of a 3D array in numpy". I hope this helps.
