Computing the FFT per frame and writing it to a file

Question

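I'm new to Python. I'm trying to compute the FFT of an uploaded wav file and return a text file in which each line holds the FFT of one frame (using GCP).

I'd like to use scipy or librosa.

The frame rate I need is 30 fps.

The wav file will have a 48 kHz sample rate.

So my questions are:

How do I split the samples of the whole wav file into the samples for each frame?
~~How do I add empty samples so that the frame length is a power of 2 (e.g. 48000/30 = 1600, add 448 empty samples to make it 2048)?~~
How do I normalize the resulting FFT array to [-1, 1]?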

Answer 1

You can use pyaudio with a callback to process the audio block by block and do whatever you need with each block.

import pyaudio
import wave
import time
import struct
import sys
import numpy as np

if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)

wf = wave.open(sys.argv[1], 'rb')

# instantiate PyAudio (1)
p = pyaudio.PyAudio()

def callback_test(in_data, frame_count, time_info, status):
    elm = wf.readframes(frame_count)  # read as many frames as PyAudio asks for
    da_i = np.frombuffer(elm, dtype='<i2')  # bytes -> little-endian 16-bit ints (assumes 16-bit audio)
    if da_i.size == 0:  # end of file: nothing left to transform
        return (elm, pyaudio.paComplete)
    da_fft = np.fft.rfft(da_i)  # fast Fourier transform for real values

    da_ifft = np.fft.irfft(da_fft, n=da_i.size)  # inverse FFT, keeping the original length
    da_out = np.rint(da_ifft).astype('<i2')  # round back to little-endian 16-bit ints
    da_m = da_out.tobytes()  # convert to bytes for playback
    return (da_m, pyaudio.paContinue)

# open stream using callback (3)
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),  # sampling frequency
                output=True,
                stream_callback=callback_test)

# start the stream (4)
stream.start_stream()

# wait for stream to finish (5)
while stream.is_active():
    time.sleep(0.1)

# stop stream (6)
stream.stop_stream()
stream.close()
wf.close()

# close PyAudio (7)
p.terminate()
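
The rfft call in the callback transforms whatever block length it is given. For the 1600-sample frames mentioned in the question (48000 / 30), one option is to let np.fft.rfft zero-pad the frame through its n argument instead of appending samples by hand. This is only a minimal sketch, assuming a mono frame; the helper names next_pow2 and fft_padded and the synthetic 1 kHz tone are made up for illustration and are not part of the example above.

```python
import numpy as np

SAMPLE_RATE = 48000              # from the question
FPS = 30                         # from the question
FRAME_LEN = SAMPLE_RATE // FPS   # 1600 samples per frame


def next_pow2(n):
    """Smallest power of two that is >= n (hypothetical helper)."""
    return 1 << (n - 1).bit_length()


def fft_padded(frame, n_fft=None):
    """FFT of one real-valued frame, zero-padded to n_fft points (hypothetical helper)."""
    if n_fft is None:
        n_fft = next_pow2(len(frame))
    # rfft appends zeros itself when n is larger than the input length
    return np.fft.rfft(frame, n=n_fft)


# synthetic test frame: a 1 kHz tone (illustrative data only)
t = np.arange(FRAME_LEN) / SAMPLE_RATE
frame = np.sin(2 * np.pi * 1000 * t)

spectrum = fft_padded(frame)
n_fft = next_pow2(FRAME_LEN)
freqs = np.fft.rfftfreq(n_fft, d=1 / SAMPLE_RATE)  # centre frequency of each bin
peak_hz = freqs[np.argmax(np.abs(spectrum))]        # bin with the largest magnitude
```

With 48 kHz input and 2048 points, the result has 2048 / 2 + 1 = 1025 bins and each bin spans 48000 / 2048, about 23.4 Hz.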

See these links for further reading:

https://people.csail.mit.edu/hubert/pyaudio/docs/#example-callback-mode-audio-io and "Python change pitch of wav file"
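
The callback example plays the file back rather than producing one FFT per 30 fps frame, so here is a minimal sketch of the file-based task from the question. It is not part of the original answer. It assumes that scipy.io.wavfile can read the file, that multi-channel audio may be averaged to mono, that the magnitude spectrum is what should be written, and that dividing by the largest magnitude in the file is an acceptable normalization (values then fall in [0, 1], which lies inside the requested [-1, 1]). The function name write_frame_ffts and its parameters are hypothetical.

```python
import numpy as np
from scipy.io import wavfile


def write_frame_ffts(wav_path, txt_path, fps=30, n_fft=2048):
    """Write one line of normalized FFT magnitudes per frame (illustrative helper)."""
    rate, samples = wavfile.read(wav_path)        # e.g. rate == 48000
    samples = samples.astype(np.float64)
    if samples.ndim == 2:                         # assumption: mix channels down to mono
        samples = samples.mean(axis=1)

    frame_len = rate // fps                       # 48000 // 30 == 1600
    if frame_len > n_fft:
        raise ValueError("n_fft must be at least the frame length")
    n_frames = len(samples) // frame_len          # a trailing partial frame is dropped
    frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)

    # rfft zero-pads each row from frame_len to n_fft points
    magnitudes = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))

    peak = magnitudes.max() if magnitudes.size else 0.0
    if peak > 0:
        magnitudes = magnitudes / peak            # scale into [0, 1]

    np.savetxt(txt_path, magnitudes, fmt="%.6f")  # one frame per line
    return magnitudes
```

A call such as write_frame_ffts("input.wav", "fft.txt") (placeholder file names) would produce a text file with one row of 1025 values per frame. The sketch relies on the sample rate being an exact multiple of the frame rate, as 48000 is of 30; otherwise the integer division makes the frames drift against the video over time.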

Solution 1
3 · Accepted · 2020-04-20 05:23:13
