"Python librosa package - how to extract audio from a spectrogram"

Question

Answer 1

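As you may have noticed, S_foreground is derived from S_full, which comes from a function called magphase. According to the documentation for that function, it can

Separate a complex-valued spectrogram D into its magnitude (S) and phase (P) components, so that D = S * P.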
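The D = S * P identity can be checked with a tiny plain-Python sketch. This is only an illustration, not librosa code: the three complex values in `D` are made up to stand in for STFT bins, and the 0.5 scaling is an arbitrary stand-in for whatever filtering produces a modified magnitude.

```python
import cmath

# Made-up complex values standing in for a few bins of librosa.stft(y).
D = [3 + 4j, -1 + 1j, 0.5 - 2j]

# Magnitude S = |D| and unit-modulus phase factor P = D / |D|.
# (This simple division assumes no bin is exactly zero.)
S = [abs(z) for z in D]
P = [z / abs(z) for z in D]

# Multiplying magnitude and phase element-wise gives back D.
D_rebuilt = [s * p for s, p in zip(S, P)]

# Arbitrary stand-in for a filtered magnitude: halve every bin.
S_foreground = [0.5 * s for s in S]

# Recombine the modified magnitude with the original phase.
D_foreground = [s * p for s, p in zip(S_foreground, P)]
```

Each value in `D_foreground` keeps the angle of the corresponding value in `D` and only its magnitude changes, which is why a modified magnitude can be multiplied by the original phase to get a complex spectrogram again.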

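Since the actual argument that magphase takes in

S_full, phase = librosa.magphase(librosa.stft(y))

is stft(y), the short-time Fourier transform of y (the initial ndarray), I think what you need to do is compute a new D: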

D_foreground = S_foreground * phase

and pass it to the inverse STFT function (librosa.istft):

y_foreground = librosa.istft(D_foreground)

After that, you can use the output function:

librosa.output.write_wav(output_file_path, y_foreground, sr)

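To be honest, I am not familiar with the theory behind this (the poor output quality I got with this method may be proof of that), but the above is my best guess at how you should export your audio. The fidelity turned out to be very poor, at least in my case, so if you really care about audio quality you may want to try some other software.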

Answer 2

@Alioth's answer works, except:

librosa.output.write_wav(output_file_path, y_foreground, sr)
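As far as I know, the `librosa.output` module was removed in librosa 0.8.0, and the librosa documentation points to `soundfile.write(path, data, samplerate)` as the replacement. If you would rather avoid an extra dependency, below is a minimal sketch that uses only the standard-library `wave` module. It assumes a mono signal of floats in [-1, 1]. `write_wav_pcm16` is a hypothetical helper name, not a librosa or soundfile function, and the sine tone is just a stand-in for `y_foreground`.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav_pcm16(path, samples, sr):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for x in samples:
        # Clip to the valid range, then scale to a signed 16-bit integer.
        x = max(-1.0, min(1.0, float(x)))
        frames += struct.pack("<h", int(round(x * 32767)))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wf.setframerate(sr)
        wf.writeframes(bytes(frames))


# Stand-in for y_foreground: one second of a 440 Hz tone at 22050 Hz.
sr = 22050
y_foreground = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]

output_file_path = os.path.join(tempfile.gettempdir(), "foreground.wav")
write_wav_pcm16(output_file_path, y_foreground, sr)
```

With a real signal you would pass the array returned by `librosa.istft` in place of the sine tone; a NumPy array works here because the helper only iterates over the samples.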