Why isn't the highest FFT peak the fundamental frequency of the tone?

Question

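I am currently trying to extract the pitches from this "Twinkle Twinkle Little Star" file. For the most part the note frequencies, which we get through the variable index_max, are correct. However, for the note C5 it returns C6. C5 is about 523 Hz and C6 is about 1046 Hz, so the FFT is reporting a frequency one octave above the expected result. This happens with many other files as well, and it seems that the lower the note, the more likely the problem is to appear. Any pointers on a better way to ask this question, or an answer, would be greatly appreciated!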

```python
import scipy.io.wavfile as wave
import numpy as np
from frequencyUtil import *  # asker's local helper module (not shown here)
from scipy.fft import fft, ifft

def read_data(scale):
    infile = "twinkle.wav"
    rate, data = wave.read(infile)
    sample_rate = int(rate / scale)  # samples per section; each section lasts 1/scale seconds
    time_frames = [data[i:i + sample_rate] for i in range(0, len(data), sample_rate)]
    notes = []
    for frame in time_frames:  # for each section, get the FFT
        frame = np.asarray(frame)
        if frame.ndim == 1:  # single channel: process as is
            dataZero = frame
        else:  # multiple channels: keep the first channel of every sample
            dataZero = frame[:, 0]
        frequencies = fft(dataZero)  # get the FFT of this section

        inverse = ifft(np.real(frequencies))  # unused below

        # get the index of the max magnitude within the music range
        index_max = np.argmax(np.abs(frequencies[0:8800 // scale]))
        # print(abs(frequencies[index_max]))
        # filters out the amplitudes that are lower than this value found through testing
        # should eventually understand the scale of the fft frequencies
        if abs(frequencies[index_max]) < 4000000 / scale:
            continue
        index_max = index_max * scale  # bin index to Hz (bins are `scale` Hz apart)
        print(index_max)
        notes.append(index_max)
    return notes
```

Answer 1

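Many pitched sounds (especially low-pitched ones) have overtones or harmonics in their spectrum that are stronger than the fundamental. These overtones are what make an instrument or voice sound more interesting than a plain sine wave generator. But since pitch is a psychoacoustic phenomenon, the human brain makes the corrections needed to perceive the pitch.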

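Thus, the strongest spectral peak in an FFT magnitude vector is often not at the fundamental frequency, because the tone has a non-trivial spectrum.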
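To make this concrete, here is a minimal sketch with a synthetic tone rather than the asker's file. The sample rate and the amplitudes (a second harmonic 2.5 times stronger than the fundamental) are assumptions chosen for illustration only.

```python
import numpy as np

rate = 44100                    # assumed sample rate in Hz
f0 = 523.25                     # C5 fundamental in Hz
t = np.arange(rate) / rate      # one second of samples, so FFT bins are 1 Hz apart

# Hypothetical tone: weak fundamental plus a stronger second harmonic.
tone = 0.4 * np.sin(2 * np.pi * f0 * t) + 1.0 * np.sin(2 * np.pi * 2 * f0 * t)

magnitude = np.abs(np.fft.rfft(tone))
freqs = np.fft.rfftfreq(len(tone), d=1 / rate)

# The strongest bin sits at the second harmonic (C6), not at the fundamental (C5).
peak_hz = freqs[np.argmax(magnitude)]
print(f"strongest FFT peak: {peak_hz:.1f} Hz, fundamental: {f0} Hz")
```

A listener would still hear this tone as C5, which is exactly the mismatch described in the question.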

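There are lots of academic papers and articles on the problem of pitch detection and estimation. Many use methods such as cepstral analysis, autocorrelation, and machine learning.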
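As one illustration of the autocorrelation idea, here is a rough sketch, not a production pitch tracker. The function name `estimate_pitch_autocorr`, the search range `fmin`/`fmax`, and the synthetic test tone are all assumptions made for this example. It picks the lag at which the signal best matches a shifted copy of itself and reports the corresponding frequency.

```python
import numpy as np

def estimate_pitch_autocorr(signal, rate, fmin=50.0, fmax=2000.0):
    """Estimate the fundamental (Hz) from the strongest autocorrelation lag."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()          # remove any DC offset
    n = len(signal)
    # Autocorrelation via the FFT, zero-padded to 2n to avoid circular wrap-around.
    spectrum = np.fft.rfft(signal, 2 * n)
    acf = np.fft.irfft(np.abs(spectrum) ** 2)[:n]
    # Only search lags that correspond to frequencies between fmax and fmin.
    lag_min = int(rate / fmax)
    lag_max = min(int(rate / fmin), n - 1)
    lag = lag_min + np.argmax(acf[lag_min:lag_max + 1])
    return rate / lag

# Hypothetical C5 tone whose second harmonic is stronger than its fundamental.
rate = 44100
f0 = 523.25
t = np.arange(4096) / rate
tone = 0.4 * np.sin(2 * np.pi * f0 * t) + 1.0 * np.sin(2 * np.pi * 2 * f0 * t)

estimate = estimate_pitch_autocorr(tone, rate)
print(f"autocorrelation estimate: {estimate:.1f} Hz")
```

Because the lag is an integer number of samples, the estimate is quantized (here to 44100/84 Hz); real recordings usually also need windowing, peak interpolation, and some guard against octave errors in the other direction.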

Solution 1
1 2020-07-03 17:38:01
