
How does FFT work? Can't seem to get the frequency spectrum for each frame in Rust

I am only getting 735 frequencies of audio data every 0.03 seconds.

I am trying to figure out how I can get the frequency spectrum in each frame. The code below only returns 735 values per frame (because samples_in_frame is 735, the number of samples in each frame), but I want the whole ~20,000 Hz range for 0.03 seconds of samples. How would I go about doing this?

path: path to the wave file

second_start: start point to process, in seconds (0)

second_length: how long to process, in seconds (10)

frame_rate: how many frames in each second (60)

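use rustfft::{num_complex::Complex, FftPlanner};

// Runs a forward FFT over one frame and returns the magnitude of each of the
// `samples_in_frame` output bins. The output is not normalized.
fn fft_this(mut buffer: Vec<Complex<f64>>, samples_in_frame: usize) -> Vec<f64> {
    let mut planner = FftPlanner::new();
    let fft = planner.plan_fft_forward(samples_in_frame);

    // In-place transform; `buffer` is expected to hold `samples_in_frame` samples.
    fft.process(&mut buffer[..]);

    // Keep only the magnitude of each complex bin.
    buffer
        .iter()
        .take(samples_in_frame)
        .map(|bin| bin.norm())
        .collect()
}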

#[tauri::command]
pub fn analyze(
    path: &str,
    second_start: f64,
    second_length: f64,
    frame_rate: f64,
) -> Vec<Vec<f64>> {
    let wave_file: Wave64 = Wave64::load(path).expect("Could not load wave.");

    let samples_start: f64 = second_start * wave_file.sample_rate();
    let samples_in_each_frame: usize = (wave_file.sample_rate() / frame_rate) as usize;

    let frames_in_length: usize = (second_length * frame_rate) as usize;

    let mut wave_buffer: Vec<Vec<f64>> = vec![];

    for frame_index in 0..frames_in_length {
        let mut frame_buffer: Vec<Complex<f64>> = vec![];

        for sample_index in 0..samples_in_each_frame {
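            // Frames are contiguous: frame N starts N * samples_in_each_frame
            // samples after the start offset.
            let buffer_index: usize =
                samples_start as usize + frame_index * samples_in_each_frame + sample_index;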

            frame_buffer.push(Complex {
                re: wave_file.at(0, buffer_index),
                im: 0.0,
            });
        }

        let processed_frame: Vec<f64> = fft_this(frame_buffer, samples_in_each_frame);

        wave_buffer.push(processed_frame);
    }

    return wave_buffer;
}

I think you need more background information about FFTs in general.

If you have 735 data points, the data consists of only 735 orthogonal frequencies.

Let's assume those 735 points represent 1 second. Then:

  • The first FFT value is the DC part, 0 Hz, which is the average of all values (up to a scaling factor, since many FFT libraries do not normalize).
  • The second is 1 Hz, the slowest frequency contained, meaning one full cycle during the sampling period.
  • The next is 2 Hz, meaning two full cycles during the sampling period.
  • ...
  • 367 Hz, meaning 367 full cycles.
  • 368 Hz, meaning 368 full cycles. IMPORTANT: Due to aliasing (see "Sampling Theorem") this is identical to -367 Hz, that is, 367 Hz rotating in the opposite direction!
  • -366 Hz
  • -365 Hz
  • ...
  • -1 Hz

In total, those are 735 frequencies. There is no more information in the signal.
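As a rough sketch, the index-to-frequency mapping from the list above could be written like this in Rust. The name `signed_bin` is made up for illustration and is not part of any FFT crate; the function assumes the usual FFT output ordering, with non-negative frequencies first and negative ones after them.

```rust
/// Maps an FFT output index `k` (0..n) to its signed number of full cycles
/// per frame. Indices above n/2 wrap around to negative frequencies.
/// Hypothetical helper, written only to illustrate the ordering.
fn signed_bin(k: usize, n: usize) -> i64 {
    assert!(k < n, "bin index out of range");
    if k <= n / 2 {
        // Non-negative frequencies: 0, 1, 2, ..., n/2
        k as i64
    } else {
        // Upper half of the output: aliases of the negative frequencies
        k as i64 - n as i64
    }
}
```

With n = 735, index 367 maps to 367, index 368 maps to -367 and index 734 maps to -1, which matches the list. Since the frame is assumed to last 1 second here, cycles per frame and Hz are the same number.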

In your case, as your time period is not 1 second but 0.03 seconds, you need to multiply your frequencies by 1/0.03 = 33.33. So you get:

  • 0 Hz
  • 33.33 Hz
  • 66.66 Hz
  • ... ...
  • 12200 Hz
  • 12233.33 Hz
  • -12233.33 Hz
  • -12200 Hz
  • ... ...
  • -66.66 Hz
  • -33.33 Hz

There simply isn't more information in your samples.
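If you want to label your magnitudes with these frequencies, a minimal sketch could look like the following. It assumes real-valued input samples, for which the negative-frequency bins mirror the positive ones in magnitude, so only the first n/2 + 1 bins are kept. `one_sided_spectrum` and its parameters are names I chose for illustration.

```rust
/// Pairs each non-negative-frequency bin with its frequency in Hz.
/// `magnitudes` is the full magnitude vector (length n) of one frame and
/// `frame_seconds` is the duration of that frame. Bin k sits at k / frame_seconds.
/// Hypothetical helper; assumes the input samples were real-valued.
fn one_sided_spectrum(magnitudes: &[f64], frame_seconds: f64) -> Vec<(f64, f64)> {
    let n = magnitudes.len();
    magnitudes
        .iter()
        .take(n / 2 + 1) // bins 0..=n/2; the rest are the mirrored negative half
        .enumerate()
        .map(|(k, &magnitude)| (k as f64 / frame_seconds, magnitude))
        .collect()
}
```

For 735 magnitudes and a frame of 0.03 seconds this yields 368 pairs, spaced 33.33 Hz apart and ending at 12233.33 Hz. One caveat: with frame_rate = 60 and 735 samples per frame, the file is presumably sampled at 44100 Hz, in which case a frame lasts 1/60 s rather than 0.03 s and the spacing would be 60 Hz, with the top bin at 22020 Hz. I am taking the 0.03 s figure from the question at face value here.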

Additional important info:

To the FFT, the signal you give it looks as if it repeats endlessly. So if your 735 samples don't actually repeat (which I guess they don't), you need to apply a window function to reduce the artifacts you get from frequencies that don't fit a whole number of cycles into the frame. For example, a clean 1.5 Hz signal in the 1 second case will give you some weird overtones, because applying no window function is equivalent to applying a rectangular window function, which has horrible overtones. More info here.
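To make the windowing step concrete, here is a minimal sketch using a Hann window, which is one common choice among many; the answer above does not prescribe a particular window. `hann_window` and `apply_window` are illustrative names, and the symmetric form of the window (denominator n - 1) is an assumption on my part.

```rust
use std::f64::consts::PI;

/// Symmetric Hann window of length n: w[i] = 0.5 * (1 - cos(2*pi*i / (n - 1))).
/// Hypothetical helper; other windows (Hamming, Blackman, ...) work similarly.
fn hann_window(n: usize) -> Vec<f64> {
    if n < 2 {
        // Degenerate lengths: nothing meaningful to taper
        return vec![1.0; n];
    }
    (0..n)
        .map(|i| 0.5 * (1.0 - (2.0 * PI * i as f64 / (n - 1) as f64).cos()))
        .collect()
}

/// Multiplies each sample by the matching window coefficient, in place.
fn apply_window(samples: &mut [f64], window: &[f64]) {
    assert_eq!(samples.len(), window.len(), "window length must match frame length");
    for (sample, w) in samples.iter_mut().zip(window) {
        *sample *= w;
    }
}
```

The idea would be to compute the window once for the frame length (735 here), multiply every frame by it, and only then build the complex buffer for the FFT. The window fades each frame to zero at both ends, which removes the jump that the implied endless repetition would otherwise create.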

For visual learners, I can strongly recommend 3Blue1Brown's videos about the Fourier transform.

