
FFT on image with Python

I have a problem with an FFT implementation in Python: I get completely strange results. I want to open an image, get the RGB value of every pixel, apply an FFT to it, and convert the result back to an image.

My steps:

1) I open the image with the PIL library in Python like this:

from PIL import Image
im = Image.open("test.png")

2) I get the pixels:

pixels = list(im.getdata())

3) I separate every pixel into r, g, b values:

for x in range(width):
    for y in range(height):
        r,g,b = pixels[x*width+y]
        red[x][y] = r
        green[x][y] = g
        blue[x][y] = b

4) Let's assume that I have one pixel (111,111,111). I use fft on all red values like this:

red = np.fft.fft(red)

And then:

print (red[0][0], green[0][0], blue[0][0])

My output is:

(53866+0j) 111 111

I think it's completely wrong. My image is 64x64, and the FFT from GIMP is completely different. My FFT gives me only arrays with huge values, which is why my output image is black.

Do you have any idea where the problem is?

[EDIT]

I've changed it, as suggested, to

red= np.fft.fft2(red)

And after that I scale it:

scale = 1/(width*height)
red= abs(red* scale)

And still, I'm getting only a black image.

[EDIT2]

OK, so let's take one image, test.png.

Assume that I just want to open it and save it as a greyscale image. I do it like this:

def getGray(pixel):
    r,g,b = pixel
    return (r+g+b)/3  

im = Image.open("test.png")
im.load()

pixels = list(im.getdata())
width, height = im.size
for x in range(width):
    for y in range(height):
        greyscale[x][y] = getGray(pixels[x*width+y])  

data = []
for x in range(width):
     for y in range(height):
         pix = greyscale[x][y]
         data.append(pix)

img = Image.new("L", (width,height), "white")
img.putdata(data)
img.save('out.png')

After this, I get this image [image: greyscale], which is OK. Now I want to apply an FFT to my image before saving it to a new one, so I do this

scale = 1/(width*height)
greyscale = np.fft.fft2(greyscale)
greyscale = abs(greyscale * scale)

after loading it. After saving it to a file, I have [image: bad FFT]. Now let's open test.png with GIMP and use the FFT filter plugin. I get this image, which is correct: [image: good FFT]

How can I handle it?

Great question. I've never heard of it but the Gimp Fourier plugin seems really neat:

A simple plug-in to do fourier transform on you image. The major advantage of this plugin is to be able to work with the transformed image inside GIMP. You can so draw or apply filters in fourier space, and get the modified image with an inverse FFT.

This idea, of doing Gimp-style manipulation on frequency-domain data and transforming back to an image, is very cool! Despite years of working with FFTs, I've never thought about doing this. Instead of messing with Gimp plugins and C executables and ugliness, let's do this in Python!

Caveat. I experimented with a number of ways to do this, attempting to get something close to the output Gimp Fourier image (gray with moiré pattern) from the original input image, but I simply couldn't. The Gimp image appears to be somewhat symmetric around the middle of the image, but it's not flipped vertically or horizontally, nor is it transpose-symmetric. I'd expect the plugin to be using a real 2D FFT to transform an H×W image into an H×W array of real-valued data in the frequency domain, in which case there would be no symmetry (it's just the to-complex FFT that's conjugate-symmetric for real-valued inputs like images). So I gave up trying to reverse-engineer what the Gimp plugin is doing and looked at how I'd do this from scratch.

The code. Very simple: read an image, apply scipy.fftpack.rfft in the leading two dimensions to get the “frequency-image”, rescale to 0–255, and save.

Note how this is different from the other answers! No grayscaling: the 2D real-to-real FFT happens independently on all three channels. No abs needed: the frequency-domain image can legitimately have negative values, and if you make them positive, you can't recover your original image. (Also a nice feature: no compromises on image size. The size of the array remains the same before and after the FFT, whether the width/height is even or odd.)

from PIL import Image
import numpy as np
import scipy.fftpack as fp

## Functions to go from image to frequency-image and back
im2freq = lambda data: fp.rfft(fp.rfft(data, axis=0),
                               axis=1)
freq2im = lambda f: fp.irfft(fp.irfft(f, axis=1),
                             axis=0)

## Read in data file and transform
data = np.array(Image.open('test.png'))

freq = im2freq(data)
back = freq2im(freq)
# Make sure the forward and backward transforms work!
assert(np.allclose(data, back))

## Helper functions to rescale a frequency-image to [0, 255] and save
remmax = lambda x: x/x.max()
remmin = lambda x: x - np.amin(x, axis=(0,1), keepdims=True)
touint8 = lambda x: (remmax(remmin(x))*(256-1e-4)).astype(int)

def arr2im(data, fname):
    out = Image.new('RGB', data.shape[1::-1])
    out.putdata(list(map(tuple, data.reshape(-1, 3))))  # putdata needs a sequence, not a map iterator
    out.save(fname)

arr2im(touint8(freq), 'freq.png')

(Aside: FFT-lover geek note. Look at the documentation for rfft for details, but I used Scipy's FFTPACK module because its rfft packs the real and imaginary parts of each frequency component as adjacent real values, so the output has the same size as the input for any 2D image (even or odd width or height). This is in contrast to Numpy's numpy.fft.rfft2 which, because it returns complex data with the last axis shortened to width/2+1, forces you to deal with a differently sized array and with deinterleaving complex-to-real yourself. Who needs that hassle for this application?)
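To make the size difference in the aside concrete, here is a minimal sketch comparing the two approaches on a small odd-sized array. The 5×7 array is a made-up stand-in for one image channel, not data from the question.

```python
import numpy as np
import scipy.fftpack as fp

# Hypothetical odd-sized single-channel "image" (5 rows, 7 columns).
img = np.arange(35, dtype=float).reshape(5, 7)

# scipy.fftpack.rfft applied along each axis: real output, same shape as input.
packed = fp.rfft(fp.rfft(img, axis=0), axis=1)

# numpy.fft.rfft2: complex output, last axis shortened to 7 // 2 + 1 = 4.
half = np.fft.rfft2(img)

print(packed.shape, packed.dtype)  # (5, 7) float64
print(half.shape, half.dtype)      # (5, 4) complex128

# Inverting numpy's result needs the original shape passed explicitly,
# because an odd last axis cannot be inferred from the half-spectrum.
restored = np.fft.irfft2(half, s=img.shape)
```

Both routes are invertible; the packed real layout is simply more convenient when the frequency data has to be saved as an image of the same size.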

Results. Given input named test.png:

[image: test input]

this snippet produces the following output (global min/max have been rescaled and quantized to 0-255):

[image: test output, frequency domain]

And upscaled:

[image: frequency image, upscaled]

In this frequency-image, the DC (0 Hz frequency) component is in the top-left, and frequencies move higher as you go right and down.
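The layout described above can be checked on synthetic data. The sketch below uses two made-up test images (a flat one and a horizontal cosine) and assumes the packing documented for scipy.fftpack.rfft, where the real part of frequency k sits at index 2k-1.

```python
import numpy as np
import scipy.fftpack as fp

def im2freq(data):
    # Same two-pass real FFT as in the code above.
    return fp.rfft(fp.rfft(data, axis=0), axis=1)

H, W = 8, 16  # illustrative size, not taken from the question

# A flat image has only a DC component: everything lands in the top-left cell,
# and that cell holds the sum of all pixel values.
flat = np.full((H, W), 10.0)
flat_freq = im2freq(flat)

# A horizontal cosine with 3 cycles across the width shows up to the right of DC.
cols = np.arange(W)
wave = np.tile(np.cos(2 * np.pi * 3 * cols / W), (H, 1))
wave_freq = im2freq(wave)
peak = np.unravel_index(np.argmax(np.abs(wave_freq)), wave_freq.shape)

print(flat_freq[0, 0])  # sum of all pixels in the flat image
print(peak)             # row 0, column 2*3 - 1 = 5 under the assumed packing
```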

Now, let's see what happens when you manipulate this image in a couple of ways. Instead of this test image, let's use a cat photo.

[image: original cat]

I made a few mask images in Gimp that I then load into Python and multiply the frequency-image with, to see what effect each mask has on the image.

Here's the code:

# Make frequency-image of cat photo
freq = im2freq(np.array(Image.open('cat.jpg')))

# Load three frequency-domain masks (DSP "filters")
bpfMask = np.array(Image.open('cat-mask-bpfcorner.png')).astype(float) / 255
hpfMask = np.array(Image.open('cat-mask-hpfcorner.png')).astype(float) / 255
lpfMask = np.array(Image.open('cat-mask-corner.png')).astype(float) / 255

# Apply each filter and save the output
arr2im(touint8(freq2im(freq * bpfMask)), 'cat-bpf.png')
arr2im(touint8(freq2im(freq * hpfMask)), 'cat-hpf.png')
arr2im(touint8(freq2im(freq * lpfMask)), 'cat-lpf.png')

Here's a low-pass filter mask on the left, and on the right, the result:

[image: low-passed cat]

In the mask, black = 0.0, white = 1.0. So the lowest frequencies are kept here (white), while the high ones are blocked (black). This blurs the image by attenuating high frequencies. Low-pass filters are used all over the place, including when decimating (“downsampling”) an image (though they will be shaped much more carefully than me drawing in Gimp 😜).
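A corner mask like this can also be built in code instead of drawn by hand. The following is a rough sketch for a single-channel array; the helper name corner_lowpass_mask, the 16×16 corner size, and the random stand-in image are all illustrative assumptions rather than anything from the masks used above.

```python
import numpy as np
import scipy.fftpack as fp

def im2freq(data):
    return fp.rfft(fp.rfft(data, axis=0), axis=1)

def freq2im(f):
    return fp.irfft(fp.irfft(f, axis=1), axis=0)

def corner_lowpass_mask(shape, keep_rows, keep_cols):
    """Hypothetical helper: 1.0 (white) in the top-left corner, 0.0 (black) elsewhere."""
    mask = np.zeros(shape)
    mask[:keep_rows, :keep_cols] = 1.0
    return mask

# Random noise as a stand-in for a grayscale photo.
rng = np.random.default_rng(0)
noisy = rng.uniform(0, 255, size=(64, 64))

# Keep only the lowest frequencies, then transform back.
mask = corner_lowpass_mask(noisy.shape, keep_rows=16, keep_cols=16)
blurred = freq2im(im2freq(noisy) * mask)

print(noisy.std(), blurred.std())  # the filtered image varies less
```

Because the DC cell stays inside the kept corner, the average brightness is unchanged while fine detail is removed. For an RGB array the mask would need a trailing channel axis (for example mask[..., None]) so that it broadcasts over the three channels.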

Here's a band-pass filter, where the lowest frequencies (see that bit of white in the top-left corner?) and high frequencies are kept, but the middling frequencies are blocked. Quite bizarre!

[image: band-passed cat]

Here's a high-pass filter, where the top-left corner that was left white in the above mask is blacked out:

[image: high-passed cat]

This is the basic idea behind edge detection: removing the low frequencies leaves mostly the sharp transitions.
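The link between high-pass filtering and edges can be seen on a synthetic step image. Note that this sketch swaps in NumPy's complex np.fft.fft2 with a radial mask rather than the packed real FFT and hand-drawn masks used above; the cutoff value is arbitrary.

```python
import numpy as np

# Synthetic 64x64 image: black left half, white right half. The periodic FFT
# sees two vertical edges: one in the middle and one at the wrap-around border.
img = np.zeros((64, 64))
img[:, 32:] = 1.0

# Frequency coordinates (cycles per pixel) for every FFT bin.
fy = np.fft.fftfreq(img.shape[0])[:, None]
fx = np.fft.fftfreq(img.shape[1])[None, :]
radius = np.sqrt(fx**2 + fy**2)

# High-pass mask: block everything within a small radius of DC.
cutoff = 0.08  # arbitrary illustrative value
mask = (radius > cutoff).astype(float)

# The mask is symmetric, so the inverse transform is real up to rounding error.
highpassed = np.fft.ifft2(np.fft.fft2(img) * mask).real

# Column with the strongest response in the first row.
strongest_col = int(np.argmax(np.abs(highpassed[0])))
print(strongest_col)
```

The strongest response sits in a column right next to one of the two edges, while the flat regions are pushed toward zero.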

Postscript. Someone, make a webapp using this technique that lets you draw masks and apply them to an image in real time!

There are several issues here.

1) Manual conversion to grayscale isn't good. Use Image.open("test.png").convert('L')

2) Most likely there is an issue with types. You shouldn't pass an np.ndarray from fft2 to a PIL image without being sure their types are compatible. abs(np.fft.fft2(something)) will return a floating-point array (np.float64 by default), whereas a PIL image expects something like an array of type np.uint8.

3) The scaling suggested in the comments looks wrong. You actually need your values to fit into the 0..255 range.

Here's my code that addresses these 3 points:

import numpy as np
from PIL import Image

def fft(channel):
    magnitude = np.absolute(np.fft.fft2(channel))  # magnitude of the complex spectrum
    magnitude *= 255.0 / magnitude.max()  # scale into the 0..255 range
    return magnitude

input_image = Image.open("test.png")
channels = input_image.split()  # splits an image into R, G, B channels
result_array = np.zeros_like(input_image)  # make sure data types,
# sizes and numbers of channels of input and output numpy arrays are the same

if len(channels) > 1:  # grayscale images have only one channel
    for i, channel in enumerate(channels):
        result_array[..., i] = fft(channel)
else:
    result_array[...] = fft(channels[0])

result_image = Image.fromarray(result_array)
result_image.save('out.png')

I must admit I haven't managed to get results identical to the GIMP FFT plugin. As far as I can see, it does some post-processing. My results are all a very low-contrast mess, and GIMP seems to overcome this by tuning contrast and scaling down non-informative channels (in your case all channels except Red are just empty). Refer to the image:

[image]
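One common way to get a more readable picture out of a low-contrast spectrum is to display the log of the magnitude with the DC term shifted to the center. The sketch below shows that approach; it is not a reconstruction of what the GIMP plugin does, and the helper name spectrum_to_uint8 and the random test channel are made up for illustration.

```python
import numpy as np

def spectrum_to_uint8(channel):
    """Hypothetical helper: displayable log-magnitude spectrum of a 2D array."""
    spectrum = np.fft.fftshift(np.fft.fft2(channel))  # move DC to the center
    magnitude = np.log1p(np.abs(spectrum))            # compress the dynamic range
    magnitude -= magnitude.min()
    peak = magnitude.max()
    if peak > 0:                                      # avoid dividing by zero
        magnitude *= 255.0 / peak
    return np.round(magnitude).astype(np.uint8)

# Random stand-in for one image channel, e.g. np.array(img.convert('L')).
rng = np.random.default_rng(1)
channel = rng.integers(0, 256, size=(64, 64)).astype(float)

display = spectrum_to_uint8(channel)
print(display.dtype, display.min(), display.max())
```

The resulting array can be passed to Image.fromarray directly, since it is already uint8 in the 0..255 range.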
