
SSIM for 3D image volume

I'm working on an image super-resolution problem (both 2D and 3D) using TensorFlow, with SSIM as one of the eval_metrics.

I'm using image.ssim from TF and measure.compare_ssim from skimage. Both give the same results for 2D, but the results always differ for 3D volumes.

I've looked into the source code of both the TF implementation and the skimage implementation. There seem to be some fundamental differences in how the two implementations interpret and handle the input images.

Code to replicate the issue:

import numpy as np
import tensorflow as tf

from skimage import measure

# 2-D case: a single 32x32 image with 64 channels
np.random.seed(12345)
a = np.random.random([32, 32, 64])
b = np.random.random([32, 32, 64])

a_ = tf.convert_to_tensor(a)
b_ = tf.convert_to_tensor(b)

ssim_2d_tf = tf.image.ssim(a_, b_, 1.0)
ssim_2d_sk = measure.compare_ssim(a, b, multichannel=True, gaussian_weights=True, data_range=1.0, use_sample_covariance=False)

print(tf.Session().run(ssim_2d_tf), ssim_2d_sk)

# 3-D case: a 32x32x32 volume with 64 channels
np.random.seed(12345)
a = np.random.random([32, 32, 32, 64])
b = np.random.random([32, 32, 32, 64])

a_ = tf.convert_to_tensor(a)
b_ = tf.convert_to_tensor(b)

ssim_3d_tf = tf.image.ssim(a_, b_, 1.0)
ssim_3d_sk = measure.compare_ssim(a, b, multichannel=True, gaussian_weights=True, data_range=1.0, use_sample_covariance=False)

s_3d_tf = tf.Session().run(ssim_3d_tf)
print(np.mean(s_3d_tf), ssim_3d_sk)

In the 3D case I have to take the mean of the output, because TensorFlow computes SSIM over the last three dimensions and therefore returns 32 SSIM values. This suggests that TF treats the input as images in NHWC format. Is this appropriate for SSIM over 3D volumes?

skimage, however, seems to be using 1D Gaussian filters, so it clearly does not take depth into account in 3D volumes either.

Can someone shed some light on this and help me decide which one to use going forward, and why?

From a cursory look at the code, it seems that TensorFlow always computes a 2D SSIM, for each image in the batch and for each channel. It averages SSIM values across channels, and returns a value for each image in the batch. For TF, a 4D array is a collection of 2D images with multiple channels.

In contrast, scikit-image computes SSIM over all dimensions, except the last one if multichannel is set. So for a 4D array, it computes a 3D SSIM for each channel and averages across channels.

This is consistent with your finding of similar results for a 3D array, but different results for a 4D array.
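The difference between the two conventions can be sketched with NumPy alone. This is only an illustration, not either library's implementation: ssim_nd is a hypothetical helper that uses a uniform window (not the Gaussian weighting used in the question), assumes a window size of 7, and handles a single channel only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def ssim_nd(x, y, win=7, data_range=1.0):
    """Mean SSIM over ALL axes of x and y, using a uniform win**ndim window.

    Simplified sketch: no Gaussian weighting, no channel axis, no padding.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    window_shape = (win,) * x.ndim
    window_axes = tuple(range(-x.ndim, 0))  # trailing axes hold each window

    wx = sliding_window_view(x, window_shape)
    wy = sliding_window_view(y, window_shape)

    # Local statistics of every window (population variance/covariance).
    mu_x = wx.mean(axis=window_axes)
    mu_y = wy.mean(axis=window_axes)
    var_x = wx.var(axis=window_axes)
    var_y = wy.var(axis=window_axes)
    cov_xy = (wx * wy).mean(axis=window_axes) - mu_x * mu_y

    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return ssim_map.mean()


rng = np.random.default_rng(12345)
vol_a = rng.random((16, 16, 16))  # (depth, height, width)
vol_b = np.clip(vol_a + 0.1 * rng.standard_normal(vol_a.shape), 0.0, 1.0)

# TensorFlow-style: the leading axis is a batch of independent 2D images,
# so each slice gets its own 2D SSIM and depth is never inside a window.
ssim_per_slice = np.array([ssim_nd(a, b) for a, b in zip(vol_a, vol_b)])

# scikit-image-style: one SSIM computed with a genuinely 3D window.
ssim_volume = ssim_nd(vol_a, vol_b)

print(ssim_per_slice.shape, ssim_per_slice.mean(), ssim_volume)
```

The per-slice version returns one value per slice, which is why the question has to average the TensorFlow output, and that average generally does not equal the value obtained with a 3D window.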


"skimage, however, seems to be using 1D Gaussian filters."

I'm not sure where you got this from. scikit-image uses an nD Gaussian for an nD image. However, a Gaussian is a separable filter, meaning it can be implemented efficiently as n successive applications of a 1D filter, one along each axis.
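Separability is easy to check numerically. The following is a minimal sketch, assuming an arbitrary sigma of 1.5 and a 7-tap kernel; the helper names are hypothetical and the code is not taken from scikit-image. It filters a 3D volume once with three 1D passes and once with a dense 3D kernel, and compares the results.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gaussian_kernel_1d(sigma=1.5, radius=3):
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def filter_separable(vol, k):
    """Apply the 1D kernel along each axis in turn ('valid' region only)."""
    out = vol
    for axis in range(vol.ndim):
        out = np.apply_along_axis(np.convolve, axis, out, k, mode="valid")
    return out


def filter_dense(vol, k):
    """Apply the full nD kernel, built as the outer product of 1D kernels."""
    kernel = k
    for _ in range(vol.ndim - 1):
        kernel = np.multiply.outer(kernel, k)
    windows = sliding_window_view(vol, kernel.shape)
    window_axes = tuple(range(-vol.ndim, 0))
    # The kernel is symmetric, so correlation and convolution coincide.
    return (windows * kernel).sum(axis=window_axes)


rng = np.random.default_rng(0)
volume = rng.random((12, 12, 12))
k = gaussian_kernel_1d()

separable = filter_separable(volume, k)
dense = filter_dense(volume, k)
print(separable.shape, np.allclose(separable, dense))
```

The three 1D passes cost on the order of 3 x 7 multiplications per output voxel, against 7^3 for the dense kernel, which is why an implementation built from 1D filters can still be a true 3D Gaussian.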
