
How to bin an image tensor so that each pixel value is bucketed into 1 of 10 values in TensorFlow

I have a dataset of pictures stored as tensors, with each pixel having a value between 0 and 1, and I have a set of "bins":

bins = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]

I want to return a tensor in which each pixel value is replaced by its nearest bin. For example, a pixel of 0.03 becomes 0.05, and a pixel of 0.79 becomes 0.75.

I want this to be done with tensors, not NumPy.

Here it is working in NumPy. TensorFlow, however, seems to be a whole different beast when it comes to iterating. I have tried tf.map_fn and tf.scan to iterate through the pixels, but I couldn't get either to work.

import numpy as np

def valueQuant(picture, splitSize):
  # This is the picture that will be returned
  Quant_Pic = np.zeros((picture.shape[0], picture.shape[1]))

  # Go through each pixel of the image
  for y_col in range(picture.shape[0]):
    for x_row in range(picture.shape[1]):
      # Isolate regions based on value
      for i in range(splitSize):
        # Low and high edges of the i-th range
        lowFloatRange = float((1/splitSize)*i)
        highFloatRange = float((1/splitSize)*(i+1))
        # Midpoint that every value in this range is mapped to
        midRange = lowFloatRange + ((highFloatRange - lowFloatRange)/2)
        # Current value of the current pixel
        curVal = picture[y_col][x_row]
        # Both edges are inclusive, so a value exactly on an edge
        # matches two ranges and ends up in the higher one
        if(curVal >= lowFloatRange and curVal <= highFloatRange):
          Quant_Pic[y_col][x_row] = midRange

  return Quant_Pic
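For reference, the nested loops above can be written without any Python-level iteration. The following is a minimal sketch, assuming equal-width ranges over [0, 1] as in the loop version; the name `value_quant_vectorized` is hypothetical and not part of the original post. It computes a range index per pixel with a floor and maps that index to the midpoint of its range.

```python
import numpy as np

def value_quant_vectorized(picture, split_size):
    """Map every value in [0, 1] to the midpoint of its equal-width range."""
    picture = np.asarray(picture, dtype=np.float64)
    # Index of the range each pixel falls in: 0 .. split_size - 1
    idx = np.floor(picture * split_size)
    # A pixel equal to 1.0 would land in index split_size, so clip it back
    idx = np.clip(idx, 0, split_size - 1)
    # Midpoint of range i is (i + 0.5) / split_size
    return (idx + 0.5) / split_size

picture = np.array([[0.03, 0.79],
                    [0.53, 1.00]])
print(value_quant_vectorized(picture, 10))
```

Values that sit exactly on an edge between two ranges may be assigned differently from the loop version because of floating-point rounding, so this should be treated as an approximation of the same idea rather than a bit-for-bit replacement.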

I was able to figure out an element-wise method using only TensorFlow operations.

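import tensorflow as tf

def quant_val(current_input):
    bins = tf.constant([0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95])
    # Broadcasting subtracts the single input value from every bin,
    # so no explicit tf.tile is needed
    dist = tf.math.subtract(bins, current_input)
    absDist = tf.math.abs(dist)
    # Index of the bin with the smallest absolute distance
    idx = tf.math.argmin(absDist)
    output = bins[idx]
    # Restore the leading dimension so the output has shape [1]
    output = tf.expand_dims(output, 0)
    print("output", output)

    return output

current_input = tf.constant([0.53])
quant_val(current_input)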

This returns the right answer for a tensor with a single value, but I am unsure how to extend it to a larger image tensor. Any help would be much appreciated! Thank you, oh kind wise ones.
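One way to extend the element-wise idea to a whole image is to let broadcasting compare every pixel against every bin at once. The following is a minimal sketch, not a definitive implementation; the function name `quant_image` and its arguments are hypothetical. It assumes a float image tensor of any shape and a 1-D list of bin centers.

```python
import tensorflow as tf

def quant_image(images, bins):
    """Replace each value in `images` with the nearest entry of `bins`."""
    images = tf.convert_to_tensor(images, dtype=tf.float32)
    bins = tf.convert_to_tensor(bins, dtype=tf.float32)
    # Add a trailing axis so shapes broadcast: [..., 1] against [num_bins]
    dist = tf.abs(tf.expand_dims(images, -1) - bins)
    # For every pixel, the index of the closest bin along the new axis
    idx = tf.argmin(dist, axis=-1)
    # Look up the bin value for each index; the result has the shape of `images`
    return tf.gather(bins, idx)

bins = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
images = tf.constant([[0.03, 0.79],
                      [0.53, 0.98]])
print(quant_image(images, bins))
```

The intermediate distance tensor holds one value per pixel per bin, so memory use grows with the number of bins. For ten bins this is usually acceptable, and it works for arbitrary, unevenly spaced bin lists.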

Rounding approach:

This is very simple, but tf.round rounds halves to the nearest even integer, so some values that land exactly on .5 are rounded up and others down. If this is not a problem:

import tensorflow as tf

def quant_val(images):                       # input values: 0 to 1
    images = (images - 0.05) * 10            # -0.5 to 9.5
    bins = tf.round(images)                  # bin index, nominally 0 to 9
    bins = tf.clip_by_value(bins, 0, 9)      # rounding at the edges can give an index such as 10
    return (bins/10) + 0.05                  # 0.05 to 0.95
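If the mixed treatment of .5 values does matter, a floor-based variant is one possible alternative. The sketch below first shows the half-to-even behavior of tf.round, then defines a hypothetical `quant_val_floor` (not part of the original answer) that assumes equal-width bins over [0, 1] and always sends a value on a bin edge to the higher bin, subject to floating-point precision.

```python
import tensorflow as tf

# tf.round rounds halves to the nearest even integer,
# so ties do not all go the same way.
print(tf.round(tf.constant([0.5, 1.5, 2.5])).numpy())  # [0. 2. 2.]

def quant_val_floor(images, num_bins=10):
    """Map values in [0, 1] to the center of their equal-width bin."""
    images = tf.convert_to_tensor(images, dtype=tf.float32)
    # Bin index from a floor instead of a round: 0 .. num_bins
    idx = tf.floor(images * num_bins)
    # An input of exactly 1.0 gives index num_bins, so clip it back
    idx = tf.clip_by_value(idx, 0, num_bins - 1)
    # Center of bin i is (i + 0.5) / num_bins
    return (idx + 0.5) / num_bins

images = tf.constant([[0.03, 0.79],
                      [0.53, 1.00]])
print(quant_val_floor(images))
```

Unlike the nearest-bin lookup, this only works when the bins are evenly spaced, which is the case for the ten bins in the question.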

