Understanding the output of fftfreq function and the fft plot for a single row in an image

I am trying to understand the function fftfreq and the plot that results from adding the real and imaginary components for one row of an image. Here is what I did:

import numpy as np
import cv2
import matplotlib.pyplot as plt

image = cv2.imread("images/construction_150_200_background.png", 0)
image_fft = np.fft.fft(image)
real = image_fft.real
imag = image_fft.imag

real_row_bw = image_fft[np.ceil(image.shape[0]/2).astype(np.int),0:image.shape[1]]
imag_row_bw = image_fft[np.ceil(image.shape[0]/2).astype(np.int),0:image.shape[1]]

sum = real_row_bw + imag_row_bw

plt.plot(np.fft.fftfreq(image.shape[1]), sum)
plt.show()

Here is an image of the generated plot: Figure 1

I read the image from disk, calculate the Fourier transform and extract the real and imaginary parts. Then I sum the sine and cosine components and plot the result using the pyplot library.

Could someone please help me understand the fftfreq function? Also, what does the peak in the plot represent for the following image: Figure 2

I understand that the Fourier transform maps the image from the spatial domain to the frequency domain, but I cannot make much sense of the graph.

Note: I am unable to upload the images directly here because, at the time of asking the question, I am getting an upload error.

I don't think that you really need fftfreq to look for frequency-domain information in images, but I'll try to explain it anyway.

fftfreq calculates the frequency that corresponds to each bin of an FFT. You are using fftfreq to define the x coordinates on your graph.

fftfreq has two arguments: one mandatory, one optional. The mandatory first argument is an integer, the window length you used to calculate the FFT. You will have the same number of frequency bins in the FFT as you had samples in the window. The optional second argument, d, is the sample spacing, that is, the inverse of the sample rate. If you don't specify it, the default is 1. I can understand you not specifying it: for an image row each sample is one pixel, so the default gives frequencies in cycles per pixel. If you know the physical size of a pixel, you could pass that instead. It's up to you.
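As a small illustration of the two arguments, here is a sketch with a window length of 8 and a spacing of 0.5; both numbers are arbitrary choices for the example and are not taken from the question.

```python
import numpy as np

# n = 8 samples with the default spacing d = 1.0:
# the frequencies are in cycles per sample (per pixel, for an image row).
freqs_default = np.fft.fftfreq(8)

# Same window length, but a hypothetical spacing of 0.5 units per sample.
# Halving the spacing doubles every frequency.
freqs_scaled = np.fft.fftfreq(8, d=0.5)

print(freqs_default)  # 0, 0.125, 0.25, 0.375, -0.5, -0.375, -0.25, -0.125
print(freqs_scaled)   # 0, 0.25, 0.5, 0.75, -1, -0.75, -0.5, -0.25
```

The returned array always has n entries, one per FFT bin, so it can be used directly as the x axis for the FFT of a row of the same length.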

Your FFT's frequency bins start at the negative Nyquist frequency, which is minus half the sample rate (-0.5 by default), or a little higher, and they end at the positive Nyquist frequency (+0.5), or a little lower.
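Whether the range touches exactly ±0.5 or stops "a little" short depends on whether the window length is even or odd. A quick check, using 200 and 201 as arbitrary example lengths:

```python
import numpy as np

even = np.fft.fftfreq(200)  # even window length
odd = np.fft.fftfreq(201)   # odd window length

# Even length: the most negative bin is exactly -0.5, the most positive is 99/200.
print(even.min(), even.max())

# Odd length: neither end reaches 0.5; the extremes are -100/201 and +100/201.
print(odd.min(), odd.max())
```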

The fftfreq function returns the frequencies in a funny order, though. The zero frequency is always the zeroth element. The frequencies count up to the maximum positive frequency, then flip to the maximum negative frequency and count upwards towards zero. The reason for this strange ordering is that if you're doing FFTs with real-valued data (you are; image pixels do not have complex values), the value at each negative frequency is the complex conjugate of the value at the corresponding positive frequency. It has the same magnitude and carries no extra information, so it is redundant. This ordering makes it easy to throw the negative frequencies away: just take the first half of the array. Since you aren't doing that, you're plotting the negative frequencies too. If you choose to ignore the second half of the array, the negative frequencies will be removed.
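A minimal sketch of this ordering, using a short synthetic row of random values as a stand-in for real image data (the row, its length of 8, and the seed are assumptions made for the example). It checks the mirror relation between negative and positive bins and shows two standard NumPy helpers: np.fft.fftshift, which reorders the bins from most negative to most positive for plotting, and np.fft.rfft with np.fft.rfftfreq, which return only the non-negative half.

```python
import numpy as np

rng = np.random.default_rng(0)
row = rng.integers(0, 256, size=8).astype(float)  # synthetic stand-in for one image row

spectrum = np.fft.fft(row)
freqs = np.fft.fftfreq(row.size)  # 0, 0.125, 0.25, 0.375, -0.5, -0.375, -0.25, -0.125

# For real input, bin -k is the complex conjugate of bin +k.
k = 3
mirrored = np.isclose(spectrum[-k], np.conj(spectrum[k]))

# fftshift reorders both arrays so the frequencies increase monotonically.
freqs_sorted = np.fft.fftshift(freqs)
spectrum_sorted = np.fft.fftshift(spectrum)

# rfft / rfftfreq drop the redundant negative half entirely.
half_spectrum = np.fft.rfft(row)
half_freqs = np.fft.rfftfreq(row.size)

print(mirrored)      # True
print(freqs_sorted)  # runs from -0.5 up to 0.375
print(half_freqs)    # 0, 0.125, 0.25, 0.375, 0.5
```

Note that rfftfreq reports the Nyquist bin as +0.5, whereas fftfreq labels the same bin -0.5 for an even window length.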

As for the strong spike that you see at the zero frequency, this is because your pixel values range from 0 to 255 and are never negative. (You load the image as grayscale by passing 0 to cv2.imread.) There's a huge "DC offset" in your data: the zero-frequency bin is simply the sum of all the pixel values in the row. It looks like you're using Matplotlib. If you are plotting in an interactive window, you can use the zoom rectangle to look at that horizontal line. If you push the DC offset off scale, setting the Y axis range to perhaps ±500, I bet you will start to see that the horizontal line isn't exactly horizontal after all.

Once you know which bin contains your DC offset (it is bin 0, the zero frequency), if you don't want to see it, you can just set the value of the FFT in that bin to zero. Then the graph will scale automatically.
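Here is one way this could look, sketched on a synthetic row of random pixel values rather than the image from the question (the row, its length of 200, and the seed are assumptions). It also shows that subtracting the row's mean before the transform has the same effect as zeroing bin 0 afterwards.

```python
import numpy as np

rng = np.random.default_rng(0)
row = rng.integers(0, 256, size=200).astype(float)  # synthetic row of pixel values

spectrum = np.fft.fft(row)

# Bin 0 is the zero-frequency (DC) term; it equals the sum of the samples.
dc_value = spectrum[0]

# Option 1: zero the DC bin on a copy of the spectrum.
spectrum_no_dc = spectrum.copy()
spectrum_no_dc[0] = 0

# Option 2: remove the mean before transforming.
spectrum_centered = np.fft.fft(row - row.mean())

print(dc_value.real, row.sum())  # the two numbers agree
```

Plotting spectrum_no_dc against np.fft.fftfreq(row.size) should then autoscale to the remaining bins instead of being dominated by the spike.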

By the way, these two lines of code perform identical calculations, so you aren't actually taking the sine and cosine components as your text says:

real_row_bw = image_fft[np.ceil(image.shape[0]/2).astype(np.int),0:image.shape[1]]
imag_row_bw = image_fft[np.ceil(image.shape[0]/2).astype(np.int),0:image.shape[1]]

And one last thing: to combine the sine and cosine components properly (once you have them), since they're at right angles, you need a vector sum rather than a scalar sum. Look at the function numpy.linalg.norm. Calling np.abs on the complex FFT output gives the same magnitude directly.
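A sketch of the vector sum, again on a synthetic row rather than the original image (the row, its length, and the seed are assumptions). It takes the real and imaginary parts separately, which the two identical lines above do not, and computes the per-bin magnitude three equivalent ways.

```python
import numpy as np

rng = np.random.default_rng(0)
row = rng.integers(0, 256, size=200).astype(float)  # synthetic row of pixel values

spectrum = np.fft.fft(row)
real_part = spectrum.real  # cosine components
imag_part = spectrum.imag  # sine components

# Vector sum of two perpendicular components: sqrt(real**2 + imag**2) per bin.
magnitude = np.hypot(real_part, imag_part)

# np.abs on a complex array returns the same magnitude.
magnitude_abs = np.abs(spectrum)

# numpy.linalg.norm gives the same result when the two components are
# stacked into a 2 x N array and the norm is taken down each column.
magnitude_norm = np.linalg.norm(np.stack([real_part, imag_part]), axis=0)

print(np.allclose(magnitude, magnitude_abs))   # True
print(np.allclose(magnitude, magnitude_norm))  # True
```

Unlike the scalar sum real + imag, the magnitude is never negative and does not cancel when the two components have opposite signs.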
