How can I pad a wav file to a specific length?

I am working with wave files of different lengths to build a deep learning model, so I want to use Python to pad them all to a length of 16 seconds.

If I understand correctly, the question wants to bring every file to one given length, so the solution is slightly different:

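from pydub import AudioSegment

path = 'you-wav-file.wav'
target_ms = 1000  # Fixed length you want, in milliseconds (16000 for 16 seconds)
audio = AudioSegment.from_wav(path)
assert len(audio) <= target_ms, f"Audio is longer than {target_ms} ms. Path: {path}"
# len(audio) is the duration in milliseconds; create only the silence that is missing.
# Matching the frame rate keeps the result at the source file's sample rate.
silence = AudioSegment.silent(duration=target_ms - len(audio), frame_rate=audio.frame_rate)

padded = audio + silence  # Append the silence after the audio
padded.export('padded-file.wav', format='wav')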

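This answer differs from the other pydub answer: this one brings every audio file to the same total length, while the other appends the same amount of silence to the end of each file.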
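For readers who would rather not install pydub, the same fixed-length idea can be sketched with the standard-library wave module. This is a swapped-in technique, not the answer's own method. The helper name pad_wav_to_length is hypothetical, and the sketch assumes uncompressed PCM input and raises an error for files longer than the target.

```python
import wave


def pad_wav_to_length(src_path, dst_path, target_seconds):
    """Pad a PCM wav file with trailing silence so it lasts target_seconds."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)

    # One frame holds one sample per channel, each sampwidth bytes wide.
    bytes_per_frame = params.nchannels * params.sampwidth
    n_frames = len(frames) // bytes_per_frame
    target_frames = int(round(target_seconds * params.framerate))
    if n_frames > target_frames:
        raise ValueError(f"{src_path} is longer than {target_seconds} s")

    # Zero bytes are silence for signed PCM of 16 bits and wider;
    # 8-bit wav data is unsigned, so its silence value is 0x80.
    silence_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    padding = silence_byte * ((target_frames - n_frames) * bytes_per_frame)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The frame count in the header is corrected after writing.
        dst.writeframes(frames + padding)
```

For the 16-second case in the question, a call would look like pad_wav_to_length('you-wav-file.wav', 'padded-file.wav', 16).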

Using pydub:

from pydub import AudioSegment

pad_ms = 1000  # milliseconds of silence needed
silence = AudioSegment.silent(duration=pad_ms)
audio = AudioSegment.from_wav('you-wav-file.wav')

padded = audio + silence  # Adding silence after the audio
padded.export('padded-file.wav', format='wav')

AudioSegment objects are immutable, so audio + silence returns a new segment and leaves the original unchanged.
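Appending a fixed amount of silence lengthens every file by the same amount, which does not by itself give files of different lengths a common length. The arithmetic can be sketched as below. The function names and clip durations are made up for illustration, and all values are in milliseconds, the unit pydub uses for len(audio).

```python
def fixed_silence_length(audio_ms, pad_ms):
    """Length after appending the same pad_ms of silence to a file."""
    return audio_ms + pad_ms


def silence_for_fixed_length(audio_ms, target_ms):
    """Silence needed so that a file ends up exactly target_ms long."""
    if audio_ms > target_ms:
        raise ValueError("audio is longer than the target length")
    return target_ms - audio_ms


durations_ms = [2000, 9500, 16000]  # made-up clip lengths
target_ms = 16000                   # 16 seconds, as in the question

after_fixed_silence = [fixed_silence_length(d, 1000) for d in durations_ms]
needed_silence = [silence_for_fixed_length(d, target_ms) for d in durations_ms]

print(after_fixed_silence)  # [3000, 10500, 17000]
print(needed_silence)       # [14000, 6500, 0]
```

With a fixed 1000 ms of silence the clips still differ in length. Computing the silence per file brings all three to 16000 ms.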

You can use Librosa's librosa.util.fix_length function to pad an audio file with silence by appending zeros to the end of the numpy array that holds the audio data:

import numpy as np
from librosa import load
from librosa.util import fix_length

file_path = 'dir/audio.wav'

sf = 44100  # sampling frequency of the wav file
required_audio_size = 5  # e.g. a 2-second audio clip needs to be padded to 5 seconds
audio, sf = load(file_path, sr=sf, mono=True)  # mono=True converts stereo audio to mono
# Target array size is required_audio_size * sampling frequency
padded_audio = fix_length(audio, size=required_audio_size * sf)

print('Array length before padding', np.shape(audio))
print('Audio length before padding in seconds', (np.shape(audio)[0] / sf))
print('Array length after padding', np.shape(padded_audio))
print('Audio length after padding in seconds', (np.shape(padded_audio)[0] / sf))

Output:

Array length before padding (88200,)
Audio length before padding in seconds 2.0
Array length after padding (220500,)
Audio length after padding in seconds 5.0
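The sample counts above follow from size = seconds * sampling rate: 2 * 44100 = 88200 and 5 * 44100 = 220500. As a rough sketch of what the padding step amounts to for a mono signal, the same result can be had with plain numpy. pad_or_trim is a hypothetical helper, not part of librosa, and it assumes a 1-D array. Like fix_length, it also cuts input that is longer than the requested size.

```python
import numpy as np


def pad_or_trim(samples, size):
    """Zero-pad or truncate a 1-D sample array to exactly `size` samples."""
    if len(samples) >= size:
        return samples[:size]
    # Append (size - len(samples)) zeros, i.e. digital silence, at the end.
    return np.pad(samples, (0, size - len(samples)), mode="constant")


sr = 44100
audio = np.ones(2 * sr, dtype=np.float32)  # stand-in for 2 seconds of audio
padded = pad_or_trim(audio, 5 * sr)

print(audio.shape, padded.shape)  # (88200,) (220500,)
```

Because the helper truncates silently, a clip longer than the target loses its tail. Check lengths first if that would matter for training.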

That said, after looking through many similar questions, pydub.AudioSegment appears to be the usual solution.
