How do I get the average and maximum volume of each channel of a stereo WAV file? For example, as an array of volumes for each second?

Question

I have this code:

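import wave
import numpy as np

# Sample width in bytes -> NumPy dtype. This mapping was missing from the
# original snippet and is an assumption (8-bit WAV is unsigned, 16/32-bit signed).
types = {1: np.uint8, 2: np.int16, 4: np.int32}

wav = wave.open("music.wav", mode="r")
(nchannels, sampwidth, framerate, nframes, comptype, compname) = wav.getparams()

content = wav.readframes(nframes)
# np.frombuffer replaces the deprecated np.fromstring for binary data
samples = np.frombuffer(content, dtype=types[sampwidth])

# Samples are interleaved (L, R, L, R, ...), so take every nchannels-th value
for n in range(nchannels):
    channel = samples[n::nchannels]
    print(channel)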

The result:

[0 0 0 ..., 0 0 8]
[0 0 0 ..., 0 0 0]

Type:

<type 'numpy.ndarray'>
<type 'numpy.ndarray'>

I can't figure out what to do next. I would also welcome a different approach :)

Answer 1

Not sure about your second question, but for the first...

If samples is a 2-D NumPy array with one row per frame and one column per channel:

samples
array([[   1,    3],
       [   2,    2],
       [   3,    4],
       [   4,    5],
       [   5,  100],
       [   6, 1000],
       [   7,    0],
       [   8,    1]])

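# axis=1 reduces across the two channels, giving one value per frame
mean1 = samples.mean(axis=1)
max1 = samples.max(axis=1)

# stack the results as columns: column 0 is the mean, column 1 is the max
outputWav = np.vstack((mean1, max1)).T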

Then write out this file, being careful of rounding issues when going from floats to ints.
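
A minimal sketch of that last step, assuming 16-bit output. The helper name write_stereo_wav, the output path, and the 8000 Hz rate are illustrative assumptions, not part of the original answer. np.rint rounds the float means to the nearest integer (halves go to the nearest even value), and np.clip keeps values inside the int16 range so the cast cannot wrap around.

```python
import os
import tempfile
import wave

import numpy as np


def write_stereo_wav(path, frames, framerate):
    """Write an (n_frames, 2) array to a 16-bit stereo WAV file.

    Hypothetical helper for illustration only.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # Round first, then clip to the int16 range before casting.
    as_int = np.clip(np.rint(frames), -32768, 32767).astype("<i2")
    out = wave.open(path, "wb")
    try:
        out.setnchannels(2)
        out.setsampwidth(2)  # 2 bytes = 16 bits per sample
        out.setframerate(framerate)
        # tobytes() emits C order, so rows become interleaved frames: L, R, L, R, ...
        out.writeframes(as_int.tobytes())
    finally:
        out.close()


samples = np.array([[1, 3], [2, 2], [3, 4], [4, 5],
                    [5, 100], [6, 1000], [7, 0], [8, 1]])
mean1 = samples.mean(axis=1)  # per-frame mean across the two channels
max1 = samples.max(axis=1)    # per-frame max across the two channels
output_wav = np.vstack((mean1, max1)).T  # shape (8, 2)

# Assumed demo location and sample rate.
demo_path = os.path.join(tempfile.gettempdir(), "mean_max_demo.wav")
write_stereo_wav(demo_path, output_wav, 8000)
```

If truncation is preferred over rounding, replace np.rint with np.trunc; the clip step is still worth keeping.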

Answer 2

Solution:

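# first channel
samples_o = samples[0::2]
# second channel
samples_c = samples[1::2]

# Group size in samples. For 3 seconds at 8000 Hz: 24000 = 8000 * 3.
# gr_count was never defined in the original; deriving the size from
# framerate (from the question's code) is an assumption.
gr_size = 3 * framerate

# Split the first channel into consecutive groups (the last one may be shorter)
lst = [samples_o[i:i+gr_size] for i in range(0, len(samples_o), gr_size)]

agr = []

for array in lst:
  # np.argmax would return the position of the peak, not its value;
  # widen to int64 so abs() cannot overflow on the most negative sample
  max_el = np.max(np.abs(array.astype(np.int64)))
  agr.append(max_el)

print(np.mean(agr))  # avg max volume for first channel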
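
The same grouping idea can be written without an explicit loop, covering both channels and one-second blocks at once. This is only a sketch: volume_per_second is a hypothetical helper name, and using the mean absolute amplitude as the "average volume" is an assumption (RMS is another common choice).

```python
import numpy as np


def volume_per_second(samples, nchannels, framerate):
    """Return (mean_abs, peak), each of shape (n_seconds, nchannels).

    Hypothetical helper; `samples` is the flat, interleaved array
    read from the WAV file as in the question.
    """
    # One row per frame, one column per channel; float avoids overflow in abs().
    frames = np.asarray(samples, dtype=np.float64).reshape(-1, nchannels)
    n_seconds = len(frames) // framerate
    # Drop the trailing partial second so the data splits into equal blocks.
    whole = np.abs(frames[:n_seconds * framerate])
    blocks = whole.reshape(n_seconds, framerate, nchannels)
    # Reduce over the samples inside each one-second block.
    return blocks.mean(axis=1), blocks.max(axis=1)


# Tiny synthetic input: 2 channels, 4 frames per "second", 2 seconds.
demo = np.array([1, -10, 2, 20, -3, 30, 4, -40,
                 5, 0, 5, 0, -5, 0, 5, 0], dtype=np.int16)
mean_abs, peak = volume_per_second(demo, nchannels=2, framerate=4)
print(mean_abs)  # rows are seconds, columns are channels
print(peak)
```

With the question's variables, the call would be volume_per_second(samples, nchannels, framerate); row i of each result then holds the values for second i.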

Answer 1
1 2013-10-30 14:09:49

Answer 2
0 Accepted 2013-11-07 09:19:46
