Python: Performing an FFT on a music file

Question

I am trying to perform an FFT on a song (an audio file in WAV format, about 3 minutes long). In case it is relevant, I created the file as follows:

ffmpeg -i "$1" -vn -ab 128k -ar 44100 -y -ac 1 "${1%.webm}.wav"

where $1 is the name of the webm file.

This is the code that is supposed to display the FFT of the given file:

import os

import matplotlib.pyplot as plt
import numpy as np
import scipy.io.wavfile

# presume file already converted to wav.
file = os.path.join(temp_folder, file_name)

rate, aud_data = scipy.io.wavfile.read(file)

# wav file is mono.
channel_1 = aud_data[:]

fourier = np.fft.fft(channel_1)

plt.figure(1)
plt.plot(fourier)
plt.xlabel('n')
plt.ylabel('amplitude')
plt.show()

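The problem is that this takes forever. The output is taking so long to appear that I have had enough time to research and write this post, and it still has not finished.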

I assume the file is too long, since

print(aud_data.shape)

outputs (9218368,), but this looks like a real-world problem, so I hope there is some way to obtain the FFT of an audio file.

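What am I doing wrong? Thank you.

Edit

A better formulation of the question would be: is the FFT any good for music processing? For example, for measuring the similarity of two pieces.

As pointed out in the comments, my naive approach is far too slow.

Thank you.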

Answer 1

To considerably speed up the FFT part of the analysis, you can zero-pad the data to a length that is a power of 2:

import numpy as np
import matplotlib.pyplot as plt

# rate, aud_data = scipy.io.wavfile.read(file)
rate, aud_data = 44000, np.random.random((9218368,))

len_data = len(aud_data)

channel_1 = np.zeros(2**(int(np.ceil(np.log2(len_data)))))
channel_1[0:len_data] = aud_data

fourier = np.fft.fft(channel_1)
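
The padding step above can also be sketched more compactly. This is only an illustration, not part of the original answer: next_pow2 and padded_fft are hypothetical helper names. It relies on the n argument of np.fft.fft, which zero-pads the input to the requested length, so no separate buffer has to be filled by hand.

```python
import numpy as np


def next_pow2(n):
    """Return the smallest power of two that is >= n (for n >= 1).

    Hypothetical helper; uses integer arithmetic instead of log2.
    """
    return 1 << (n - 1).bit_length()


def padded_fft(samples):
    """FFT of `samples`, zero-padded to the next power-of-two length.

    Hypothetical helper; np.fft.fft pads with zeros when `n` exceeds
    the input length.
    """
    n_fft = next_pow2(len(samples))
    return np.fft.fft(samples, n=n_fft)


# Small demonstration: 5 samples are padded to 8 before the transform.
demo = np.arange(5, dtype=float)
print(len(padded_fft(demo)))
```

For the file in the question, next_pow2(9218368) gives 16777216 (2**24), which is the same length the expression with np.ceil(np.log2(len_data)) produces.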

Here is an example that uses the approach above to plot the real part of the Fourier transform of a sum of several sinusoids:

import numpy as np
import matplotlib.pyplot as plt

# rate, aud_data = scipy.io.wavfile.read(file)
rate = 44000
ii = np.arange(0, 9218368)
t = ii / rate
aud_data = np.zeros(len(t))
for w in [1000, 5000, 10000, 15000]:
    aud_data += np.cos(2 * np.pi * w * t)

# From here down, everything else can be the same
len_data = len(aud_data)

channel_1 = np.zeros(2**(int(np.ceil(np.log2(len_data)))))
channel_1[0:len_data] = aud_data

fourier = np.fft.fft(channel_1)
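# Frequency of each FFT bin: k * rate / N for k = 0 .. N-1
w = np.linspace(0, rate, len(fourier), endpoint=False)

# For real input, the first half of the output holds the non-negative
# frequencies; the second half holds the mirrored negative frequencies.
fourier_to_plot = fourier[0:len(fourier)//2]
w = w[0:len(fourier)//2]

plt.figure(1)

# The values are complex, so select the real part explicitly for plotting
plt.plot(w, fourier_to_plot.real)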
plt.xlabel('frequency')
plt.ylabel('amplitude')
plt.show()
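
The example above plots only the real part. As a rough sketch of an alternative (not taken from the original answer), the magnitude spectrum is often easier to read. For real-valued audio, np.fft.rfft returns only the non-negative frequencies and np.fft.rfftfreq gives the matching frequency axis. The helper name magnitude_spectrum and the one-second test signal are assumptions made for illustration.

```python
import numpy as np


def magnitude_spectrum(samples, rate):
    """Return (frequencies in Hz, magnitudes) for a real-valued signal.

    Hypothetical helper: rfft computes only the non-negative frequency
    bins, which is all a real-valued signal needs.
    """
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    return freqs, np.abs(spectrum)


# One second of the same four tones as in the example above
# (assumed test signal, kept short so the sketch runs quickly).
rate = 44000
t = np.arange(rate) / rate
signal = sum(np.cos(2 * np.pi * f * t) for f in (1000, 5000, 10000, 15000))

freqs, mags = magnitude_spectrum(signal, rate)

# Frequencies of the four strongest bins, in ascending order
strongest = np.sort(freqs[np.argsort(mags)[-4:]])
print(strongest)
```

With exactly one second of signal the bins are 1 Hz apart, so each tone falls on a single bin. With zero-padding or other lengths, each peak spreads over neighbouring bins and the four strongest bins need not belong to four different tones.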

Solution 1
3, accepted, 2017-12-28 22:39:49
