Getting max amplitude for an audio file per second
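I know there are some similar questions here, but most of them are about generating waveform images, which is not what I want.
My goal is to generate a waveform visualization for an audio file, similar to SoundCloud, but not as an image. I want the max amplitude for each second (or half second) of the audio clip in an array. I can then use that data to build a CSS-based visualization.
Ideally, I would get an array in which each second's amplitude is expressed as a fraction of the maximum amplitude of the whole audio file. Here is an example: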
[
    0.0,  # Relative max amplitude of first second of audio clip (0%)
    0.04, # Relative max amplitude of second second of audio clip (4%)
    0.15, # Relative max amplitude of third second of audio clip (15%)
    # Some more
    1.0,  # The highest amplitude of the whole audio clip will be 1.0 (100%)
]
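I assume I will have to use at least numpy and Python's wave module, but I'm not sure how to get the data I want. I'd like to use Python, but I'm not completely against using some kind of command-line tool.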
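For reference, the computation described above can be sketched with the standard-library wave module and NumPy. This is only a minimal sketch under stated assumptions: the input is an uncompressed 16-bit PCM WAV file that fits in memory, and the function name `relative_peaks` and its `window_seconds` parameter are made up for illustration.

```python
import wave

import numpy as np


def relative_peaks(path, window_seconds=1.0):
    """Return the peak amplitude of each window, scaled so the loudest is 1.0.

    Hypothetical helper; assumes a 16-bit PCM WAV file.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())

    # Interleaved little-endian 16-bit samples -> array of shape (frames, channels)
    samples = np.frombuffer(raw, dtype="<i2").reshape(-1, channels)
    # Widen to int32 so abs(-32768) does not overflow, then keep the
    # loudest channel for every frame.
    amplitude = np.abs(samples.astype(np.int32)).max(axis=1)

    # Number of frames per window (at least one frame)
    window = max(1, int(rate * window_seconds))
    peaks = [
        int(amplitude[start:start + window].max())
        for start in range(0, len(amplitude), window)
    ]

    overall = max(peaks, default=0)
    if overall == 0:
        # Completely silent (or empty) file: avoid dividing by zero
        return [0.0 for _ in peaks]
    return [peak / overall for peak in peaks]
```

Calling `relative_peaks("clip.wav", 0.5)` (with a hypothetical file name) would give one value per half second instead of one per second.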
If you allow gstreamer, here is a small script that could do the trick. It accepts any audio file that gstreamer can handle.
Snippet:
import os, sys, pygst
pygst.require('0.10')
import gst, gobject
gobject.threads_init()

def get_peaks(filename):
    global do_run

    pipeline_txt = (
        'filesrc location="%s" ! decodebin ! audioconvert ! '
        'audio/x-raw-int,channels=1,rate=44100,endianness=1234,'
        'width=32,depth=32,signed=(bool)True !'
        'level name=level interval=1000000000 !'
        'fakesink' % filename)
    pipeline = gst.parse_launch(pipeline_txt)

    level = pipeline.get_by_name('level')
    bus = pipeline.get_bus()
    bus.add_signal_watch()

    peaks = []
    do_run = True

    def show_peak(bus, message):
        global do_run
        if message.type == gst.MESSAGE_EOS:
            pipeline.set_state(gst.STATE_NULL)
            do_run = False
            return
        # filter only on level messages
        if message.src is not level or \
           not message.structure.has_key('peak'):
            return
        peaks.append(message.structure['peak'][0])

    # connect the callback
    bus.connect('message', show_peak)

    # run the pipeline until we got eos
    pipeline.set_state(gst.STATE_PLAYING)
    ctx = gobject.gobject.main_context_default()
    while ctx and do_run:
        ctx.iteration()

    return peaks

def normalize(peaks):
    _min = min(peaks)
    _max = max(peaks)
    d = _max - _min
    return [(x - _min) / d for x in peaks]

if __name__ == '__main__':
    filename = os.path.realpath(sys.argv[1])
    peaks = get_peaks(filename)

    print 'Sample is %d seconds' % len(peaks)
    print 'Minimum is', min(peaks)
    print 'Maximum is', max(peaks)

    peaks = normalize(peaks)
    print peaks
And a sample output:
$ python gstreamerpeak.py 01\ Tron\ Legacy\ Track\ 1.mp3
Sample is 182 seconds
Minimum is -349.999999922
Maximum is -2.10678956719
[0.0, 0.0, 0.9274581631597019, 0.9528318436488018, 0.9492396611762614,
0.9523404330322813, 0.9471685835966183, 0.9537281219301242, 0.9473486577135167,
0.9479292126411365, 0.9538221105563514, 0.9483845795252251, 0.9536790832823281,
0.9477264933378022, 0.9480077366961968, ...
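The minimum and maximum printed above are negative, which suggests the peak values are in decibels relative to full scale. Min-max scaling them therefore gives a logarithmic picture in which almost every second lands near 0.95. If a linear fraction of the loudest peak is wanted instead, as in the question, one possible conversion is sketched below. It assumes the values really are dB of amplitude (20·log10), and the helper name `db_to_relative` is made up for illustration.

```python
def db_to_relative(peaks_db):
    """Convert dB peak values to linear amplitudes relative to the loudest one.

    Hypothetical helper; assumes each value is 20 * log10(amplitude).
    """
    # Undo the logarithm: amplitude = 10 ** (dB / 20)
    linear = [10 ** (db / 20.0) for db in peaks_db]
    loudest = max(linear) if linear else 0.0
    if loudest == 0.0:
        # Nothing but silence (or no data): avoid dividing by zero
        return [0.0 for _ in linear]
    return [value / loudest for value in linear]
```

With this conversion a second that peaks 6 dB below the loudest one comes out at roughly 0.5, and the -350 dB silence at the start of the track comes out as effectively 0.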