Getting correct frequencies using a fast Fourier transform

I am trying to understand the frequencies in a dataset and am having trouble getting the fast Fourier transform to work. The main problem is that I cannot figure out how to get the correct frequencies on the x-axis.

Background

I have a dataset with many columns, but the columns of interest are TOF (time of flight) and dE/dx. I have attached the CSV file containing the data. Here is how I access it.

import pandas as pd
file = 'np_15us.csv'
dataset = pd.read_csv(file,skiprows=8)
df = dataset[:-1]  # necessary because the last row of the dataset is null for some reason
x = df['TOF']  # TOF is in microseconds
y = df['dE/dx']

Now, when you plot x vs. y, it's roughly a sinusoid. I can eyeball the frequency to be roughly 116 kHz. I want to get the exact frequencies by using a fast Fourier transform because I expect other datasets to be imperfect sinusoids.

[Figure: dE/dx vs. TOF]

Problem

When I try to take the FFT of the dataset using this code:

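import numpy as np
import matplotlib.pyplot as plt
# func_1, samp, and dt are defined as in the second script further down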
x_new = np.arange(0,14, dt)
y_new = func_1(x_new)
fs = len(y_new)

fig = plt.figure(2)
plt.subplot(2,1,1)
plt.plot(x_new, y_new)
plt.xlabel('time (usec)')
plt.ylabel('E (V/mm)')
plt.subplot(2,1,2)
fft = np.fft.fft(y)/len(y)
fft = fft[range(int(len(y)/2))]
tpCount = len(fft)
values = np.arange(int(tpCount))
timePeriod = tpCount/samp
frequencies = (values/(2*timePeriod))*10**6 #followed some tutorial to get here
plt.plot(frequencies[:100], abs(fft)[:100]) #zooming in to one of the peaks 
fig.tight_layout()
plt.show()

I get:

[Figure: FFT magnitude vs. frequency]

This has a frequency of roughly 260 kHz, which is an overestimate. I can run the same script using np.fft.fftfreq to get

from scipy import interpolate

func_1 = interpolate.interp1d(x, y)
samp = 100
dt = 1/samp
x_new = np.arange(0,14, dt)
y_new = func_1(x_new)
fs = len(y_new)
fig = plt.figure(2)
plt.subplot(2,1,1)
plt.plot(x_new, y_new)
plt.xlabel('time (usec)')
plt.ylabel('E (V/mm)')
plt.subplot(2,1,2)
fft = np.fft.fft(y)/len(y)
freqs = np.fft.fftfreq(len(fft),dt)
fft_shift = np.fft.fftshift(fft)
freqs = np.fft.fftshift(freqs)
plt.plot(freqs[int(len(freqs)/2):int(len(freqs))-300], abs(fft_shift[int(len(fft_shift)/2):int(len(freqs))-300]))  # now I don't understand the frequencies
fig.tight_layout()
plt.show()

[Figure: FFT magnitude vs. frequency using np.fft.fftfreq]

Whichever way I use, I get incorrect frequencies, so I am definitely doing something wrong. I don't really understand how the frequencies are calculated using np.fft.fftfreq.

I don't know the exact sampling frequency of the dataset, which is why I am interpolating to get more control over it. I am not sure if I am supposed to be doing that. I would like to use np.fft.fftfreq since the code for that seems cleaner.

Thank you for your help. Please let me know if you have any questions.

Please note: in this case, dE/dx in the dataset should actually be dV/dx, as shown in the first plot. The CSV file just has it named incorrectly.

Link to CSV file: https://drive.google.com/file/d/1LNcue82K2y4ZgKr8cPIgp7VCC2vFk_J9/view?usp=sharing

If your data is very close to a sinusoid like this, and you generally have at least one period, I think you will get a much better estimate in the time domain. An FFT spreads the signal's energy across discrete frequency bins, and just picking the top bin will not give you the best result.
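
To illustrate the point about bins, here is a minimal sketch on synthetic data, not the CSV from the question. It assumes a pure 116 kHz sine sampled every 0.01 µs for 14 µs; `f_true`, `n_samples`, and `dt` are assumed values chosen to resemble the question's setup. The bin spacing is 1/(N·dt), about 71 kHz here, so a 116 kHz tone cannot land exactly on a bin and the tallest bin sits well away from it.

```python
import numpy as np

# Assumed synthetic stand-in for the measured signal (not the real CSV data).
f_true = 116e3        # Hz
dt = 0.01e-6          # sample spacing in seconds (0.01 us)
n_samples = 1400      # 14 us of data
t = np.arange(n_samples) * dt
y = np.sin(2 * np.pi * f_true * t)

# rfft / rfftfreq keep only the non-negative frequencies of a real signal.
spectrum = np.abs(np.fft.rfft(y)) / n_samples
freqs = np.fft.rfftfreq(n_samples, d=dt)   # in Hz because dt is in seconds

bin_width = freqs[1] - freqs[0]            # equals 1 / (n_samples * dt)
peak_freq = freqs[1:][np.argmax(spectrum[1:])]  # tallest bin, skipping DC

print(f"bin width: {bin_width / 1e3:.1f} kHz")
print(f"peak bin:  {peak_freq / 1e3:.1f} kHz (true: {f_true / 1e3:.1f} kHz)")
```

The frequency axis comes out in the inverse of whatever unit `d` is given in, so passing a spacing in microseconds yields MHz rather than Hz. The sample count and spacing passed to `rfftfreq` or `fftfreq` also need to describe the same array that was transformed.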

I would measure the distance between zero crossings of the dE/dx value (do a linear interpolation to get a more exact zero-crossing location). Without even doing the interpolation, I got a value of 117.9 kHz.
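
A rough sketch of the zero-crossing approach with linear interpolation might look like the following. The helper name `zero_crossing_frequency` is hypothetical, the test signal is synthetic rather than the CSV data, and the sketch assumes the signal oscillates about zero with no significant offset or noise-induced extra crossings.

```python
import numpy as np

def zero_crossing_frequency(t, y):
    """Estimate frequency from linearly interpolated zero crossings.

    Hypothetical helper: assumes y oscillates about zero and is clean
    enough that each half period produces exactly one sign change.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)

    # Indices i where the sign changes between sample i and sample i + 1.
    sign = np.signbit(y)
    idx = np.nonzero(sign[:-1] != sign[1:])[0]
    if len(idx) < 2:
        raise ValueError("need at least two zero crossings")

    # Where the straight line through the two neighbouring samples hits zero.
    t_cross = t[idx] - y[idx] * (t[idx + 1] - t[idx]) / (y[idx + 1] - y[idx])

    # Consecutive crossings are half a period apart.
    half_period = np.mean(np.diff(t_cross))
    return 1.0 / (2.0 * half_period)

# Synthetic check (assumed values): 0.116 cycles/us = 116 kHz, 14 us long.
t_us = np.arange(1400) * 0.01
y = np.sin(2 * np.pi * 0.116 * t_us + 0.3)
f_est_khz = zero_crossing_frequency(t_us, y) * 1e3  # 1/us = MHz, so x1e3 for kHz
print(f"estimated frequency: {f_est_khz:.2f} kHz")
```

With the real data you would pass `df['TOF']` and `df['dE/dx']` as arrays. Because TOF is in microseconds, the returned value is in MHz. Noisy data may need smoothing first, since spurious sign changes near zero would shorten the measured half periods.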
