Invertible STFT and ISTFT in Python
Is there any general-purpose form of short-time Fourier transform with a corresponding inverse transform built into SciPy or NumPy or whatever?
There's the pyplot specgram function in matplotlib, which calls ax.specgram(), which calls mlab.specgram(), which calls _spectral_helper():
#The checks for if y is x are so that we can use the same function to
#implement the core of psd(), csd(), and spectrogram() without doing
#extra calculations. We return the unaveraged Pxy, freqs, and t.
but
#This is a helper function that implements the commonality between the
#psd, csd, and spectrogram. It is NOT meant to be used outside of mlab
I'm not sure if this can be used to do an STFT and ISTFT, though. Is there anything else, or should I translate something like these MATLAB functions?
I know how to write my own ad-hoc implementation; I'm just looking for something full-featured, which can handle different windowing functions (but has a sane default), is fully invertible with COLA windows (istft(stft(x))==x), is tested by multiple people, has no off-by-one errors, handles the ends and zero padding well, has a fast RFFT implementation for real input, etc.
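One way to sanity-check the COLA (constant overlap-add) requirement mentioned above is to overlap-add shifted copies of a window and see whether the sum is flat away from the edges. This is only a minimal NumPy sketch; the helper names ola_window_sum and is_cola are made up for illustration and do not come from any library.

```python
import numpy as np

def ola_window_sum(window, hop, n_frames):
    """Overlap-add n_frames copies of `window`, each shifted by `hop` samples."""
    n = len(window)
    total = np.zeros(hop * (n_frames - 1) + n)
    for m in range(n_frames):
        total[m * hop:m * hop + n] += window
    return total

def is_cola(window, hop, tol=1e-10):
    """True if the shifted windows sum to a constant in the fully overlapped region."""
    n = len(window)
    n_frames = 2 * (n // hop) + 2          # enough frames for a fully overlapped middle
    total = ola_window_sum(window, hop, n_frames)
    interior = total[n:-n]                 # skip the ramp-up and ramp-down at the ends
    return bool(interior.max() - interior.min() < tol)

n = 512
periodic_hann = np.hanning(n + 1)[:-1]     # drop the last sample: periodic Hann
symmetric_hann = np.hanning(n)             # NumPy's default symmetric Hann

print(is_cola(periodic_hann, n // 2))      # periodic Hann, 50% overlap
print(is_cola(symmetric_hann, n // 2))     # symmetric Hann, 50% overlap
print(is_cola(periodic_hann, n // 4))      # periodic Hann, 75% overlap
```

With these settings the periodic Hann window passes at both hop sizes, while the symmetric one misses by a small ripple, which is one reason implementations often drop the last window sample.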
Here is my Python code, simplified for this answer:
import numpy as np
import pylab
def stft(x, fs, framesz, hop):
    framesamp = int(framesz*fs)  # frame size in samples
    hopsamp = int(hop*fs)        # hop size in samples
    w = np.hanning(framesamp)
    X = np.array([np.fft.fft(w*x[i:i+framesamp])
                  for i in range(0, len(x)-framesamp, hopsamp)])
    return X
def istft(X, fs, T, hop):
    x = np.zeros(int(T*fs))
    framesamp = X.shape[1]
    hopsamp = int(hop*fs)
    for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)):
        x[i:i+framesamp] += np.real(np.fft.ifft(X[n]))  # overlap-add
    return x
Notes:
The list comprehension plays the role of blkproc in Matlab. Instead of a for loop, I apply a command (e.g., fft) to each frame of the signal inside a list comprehension, and then np.array casts it to a 2D array. I use this to make spectrograms, chromagrams, MFCC-grams, and much more.
istft uses a naive overlap-and-add method. In order to reconstruct the original signal, the sum of the sequential window functions must be constant, preferably equal to unity (1.0). Here I use a Hann (hanning) window and a 50% overlap, which works perfectly. See this discussion for more information.
A test:
if __name__ == '__main__':
    f0 = 440         # Compute the STFT of a 440 Hz sinusoid
    fs = 8000        # sampled at 8 kHz
    T = 5            # lasting 5 seconds
    framesz = 0.050  # with a frame size of 50 milliseconds
    hop = 0.025      # and hop size of 25 milliseconds.
    # Create test signal and STFT.
    t = np.linspace(0, T, T*fs, endpoint=False)
    x = np.sin(2*np.pi*f0*t)
    X = stft(x, fs, framesz, hop)
    # Plot the magnitude spectrogram.
    pylab.figure()
    pylab.imshow(np.absolute(X.T), origin='lower', aspect='auto',
                 interpolation='nearest')
    pylab.xlabel('Time')
    pylab.ylabel('Frequency')
    pylab.show()
    # Compute the ISTFT.
    xhat = istft(X, fs, T, hop)
    # Plot the input and output signals over 0.1 seconds.
    T1 = int(0.1*fs)
    pylab.figure()
    pylab.plot(t[:T1], x[:T1], t[:T1], xhat[:T1])
    pylab.xlabel('Time (seconds)')
    pylab.figure()
    pylab.plot(t[-T1:], x[-T1:], t[-T1:], xhat[-T1:])
    pylab.xlabel('Time (seconds)')
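The plots above give a visual check; a numeric one could be sketched as follows. This is a self-contained variant rather than the answer's exact code: it assumes a periodic Hann window and np.fft.rfft (both my substitutions), and the names stft_ola and istft_ola are hypothetical.

```python
import numpy as np

def stft_ola(x, frame_len, hop):
    """Windowed real FFT of each frame (periodic Hann analysis window)."""
    w = np.hanning(frame_len + 1)[:-1]
    starts = range(0, len(x) - frame_len + 1, hop)
    return np.array([np.fft.rfft(w * x[i:i + frame_len]) for i in starts])

def istft_ola(X, frame_len, hop, length):
    """Naive overlap-add: no synthesis window and no normalization."""
    x = np.zeros(length)
    for n, frame in enumerate(X):
        i = n * hop
        x[i:i + frame_len] += np.fft.irfft(frame, n=frame_len)
    return x

fs, f0, T = 8000, 440, 1
frame_len, hop = 400, 200              # 50 ms frames, 25 ms hop (50% overlap)
t = np.arange(T * fs) / fs
x = np.sin(2 * np.pi * f0 * t)

xhat = istft_ola(stft_ola(x, frame_len, hop), frame_len, hop, len(x))

# The first and last (frame_len - hop) samples are covered by a single window,
# so they come back tapered. This slice assumes (len(x) - frame_len) is a
# multiple of hop, which holds for the values chosen here.
edge = frame_len - hop
interior = slice(edge, len(x) - edge)
print(np.max(np.abs(x[interior] - xhat[interior])))   # interior error
print(np.max(np.abs(x[:edge] - xhat[:edge])))         # error in the tapered start
```

The interior error should sit at rounding level, while the tapered ends show why implementations either pad the signal or divide by the summed window.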
Here is the STFT code that I use. STFT + ISTFT here gives perfect reconstruction (even for the first frames). I slightly modified the code given here by Steve Tjoa: here the magnitude of the reconstructed signal is the same as that of the input signal.
import numpy as np
def stft(x, fftsize=1024, overlap=4):
    hop = fftsize // overlap  # integer hop size (Python 3 needs //)
    w = np.hanning(fftsize+1)[:-1]  # better reconstruction with this trick +1)[:-1]
    return np.array([np.fft.rfft(w*x[i:i+fftsize]) for i in range(0, len(x)-fftsize, hop)])
def istft(X, overlap=4):
    fftsize = (X.shape[1]-1)*2
    hop = fftsize // overlap
    w = np.hanning(fftsize+1)[:-1]
    x = np.zeros(X.shape[0]*hop)
    wsum = np.zeros(X.shape[0]*hop)
    for n,i in enumerate(range(0, len(x)-fftsize, hop)):
        x[i:i+fftsize] += np.real(np.fft.irfft(X[n])) * w  # overlap-add
        wsum[i:i+fftsize] += w ** 2.
    pos = wsum != 0
    x[pos] /= wsum[pos]  # normalize by the summed squared window
    return x
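To put a number on the "perfect reconstruction" claim, the same idea (window on analysis, window again on synthesis, then divide by the summed squared window) can be checked in a single loop. This is a rough sketch under my own assumptions, not the answer's code: wola_roundtrip is a hypothetical name, the frames are chosen to tile the signal exactly, and samples whose summed squared window is nearly zero are left out of the comparison.

```python
import numpy as np

def wola_roundtrip(x, fftsize=1024, overlap=4):
    """STFT then weighted overlap-add, normalized by the summed w**2."""
    hop = fftsize // overlap
    w = np.hanning(fftsize + 1)[:-1]                 # periodic Hann
    xhat = np.zeros(len(x))
    wsum = np.zeros(len(x))
    for i in range(0, len(x) - fftsize + 1, hop):
        frame = np.fft.rfft(w * x[i:i + fftsize])                    # analysis
        xhat[i:i + fftsize] += np.fft.irfft(frame, n=fftsize) * w    # synthesis
        wsum[i:i + fftsize] += w ** 2
    covered = wsum > 1e-8        # samples that some window actually reaches
    xhat[covered] /= wsum[covered]
    return xhat, covered

rng = np.random.default_rng(0)
x = rng.standard_normal(16384)
xhat, covered = wola_roundtrip(x)

print(covered.sum(), "of", len(x), "samples covered")
print(np.max(np.abs(x[covered] - xhat[covered])))
```

Because each sample is divided by exactly the weight it received, the first and last frames are recovered as well, apart from the handful of samples where the window is essentially zero.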
librosa.core.stft and istft look pretty similar to what I was looking for, though they didn't exist at the time:
librosa.core.stft(y, n_fft=2048, hop_length=None, win_length=None, window=None, center=True, dtype=<type 'numpy.complex64'>)
They don't invert exactly, though; the ends are tapered.
Neither of the above answers worked well out of the box for me, so I modified Steve Tjoa's.
import numpy as np
def stft(x, fs, framesz, hop):
    """
    x - signal
    fs - sample rate
    framesz - frame size
    hop - hop size (frame size = overlap + hop size)
    """
    framesamp = int(framesz*fs)
    hopsamp = int(hop*fs)
    w = np.hamming(framesamp)
    X = np.array([np.fft.fft(w*x[i:i+framesamp])
                  for i in range(0, len(x)-framesamp, hopsamp)])
    return X
def istft(X, fs, T, hop):
    """ T - signal length """
    length = int(T*fs)
    x = np.zeros(length)
    framesamp = X.shape[1]
    hopsamp = int(hop*fs)
    for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)):
        x[i:i+framesamp] += np.real(np.fft.ifft(X[n]))
    # calculate the inverse envelope to scale results at the ends.
    env = np.zeros(length)
    w = np.hamming(framesamp)
    for i in range(0, len(x)-framesamp, hopsamp):
        env[i:i+framesamp] += w
    rem = length % hopsamp
    if rem:  # guard: with rem == 0 the slice [-0:] would select the whole array
        env[-rem:] += w[-rem:]
    env = np.maximum(env, .01)
    return x/env # right side is still a little messed up...
Found another STFT, but no corresponding inverse function:
http://code.google.com/p/pytfd/source/browse/trunk/pytfd/stft.py
def stft(x, w, L=None):
    ...
    return X_stft
I also found this on GitHub, but it seems to operate on pipelines instead of normal arrays:
http://github.com/ronw/frontend/blob/master/basic.py#LID281
def STFT(nfft, nwin=None, nhop=None, winfun=np.hanning):
    ...
    return dataprocessor.Pipeline(Framer(nwin, nhop), Window(winfun),
                                  RFFT(nfft))
def ISTFT(nfft, nwin=None, nhop=None, winfun=np.hanning):
    ...
    return dataprocessor.Pipeline(IRFFT(nfft), Window(winfun),
                                  OverlapAdd(nwin, nhop))
I think scipy.signal has what you are looking for. It has reasonable defaults, supports multiple window types, etc.
http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.signal.spectrogram.html
from scipy.signal import spectrogram
freq, time, Spec = spectrogram(signal)
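Note that spectrogram returns a power spectrogram, which on its own cannot be inverted. Later SciPy releases (0.19 and up, as far as I know) also provide scipy.signal.stft and scipy.signal.istft, and a round trip with them might look like the sketch below. The test signal and window settings are my own choices for illustration; newer SciPy versions additionally offer a ShortTimeFFT class, which is not shown here.

```python
import numpy as np
from scipy import signal

fs = 8000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 440 * t)

# Analysis: Hann window with 50% overlap, a COLA combination.
f, frame_times, Zxx = signal.stft(x, fs=fs, window='hann',
                                  nperseg=512, noverlap=256)

# Synthesis: pass the same window and segment settings back in.
_, xrec = signal.istft(Zxx, fs=fs, window='hann',
                       nperseg=512, noverlap=256)

xrec = xrec[:len(x)]   # istft can return a few extra samples from padding
print(Zxx.shape)
print(np.max(np.abs(x - xrec)))
```

With the default boundary extension and padding, the ends are reconstructed too, so the error should stay at rounding level across the whole signal.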
A fixed version of basj's answer.
import numpy as np
import matplotlib.pyplot as plt
def stft(x, fftsize=1024, overlap=4):
    hop = fftsize//overlap
    w = np.hanning(fftsize+1)[:-1]  # better reconstruction with this trick +1)[:-1]
    return np.vstack([np.fft.rfft(w*x[i:i+fftsize]) for i in range(0, len(x)-fftsize, hop)])
def istft(X, overlap=4):
    fftsize = (X.shape[1]-1)*2
    hop = fftsize//overlap
    w = np.hanning(fftsize+1)[:-1]
    rcs = int(np.ceil(float(X.shape[0])/float(overlap)))*fftsize  # output buffer length
    print(rcs)
    x = np.zeros(rcs)
    wsum = np.zeros(rcs)
    for n,i in zip(X, range(0, len(X)*hop, hop)):
        l = len(x[i:i+fftsize])  # may be shorter than fftsize at the end of the buffer
        x[i:i+fftsize] += np.fft.irfft(n).real[:l]  # overlap-add
        wsum[i:i+fftsize] += w[:l]
    pos = wsum != 0
    x[pos] /= wsum[pos]  # normalize by the summed window
    return x
a = np.random.random((65536))
b = istft(stft(a))
plt.plot(range(len(a)), a, range(len(b)), b)
plt.show()
If you have access to a C binary library that does what you want, use http://code.google.com/p/ctypesgen/ to generate a Python interface to that library.