Audio data string format to numpy array
I am trying to convert the sample rate (from 44100 to 22050) of a numpy.array with 88200 samples on which I have already done some processing (such as adding silence and converting to mono). I tried to convert this array with audioop.ratecv and it works, but it returns a str instead of a numpy array, and when I wrote that data with scipy.io.wavfile.write, half of the data was lost and the audio played twice as fast (instead of slower, which would at least make some sense).

audioop.ratecv works fine with str data such as the frames read from a file opened with wave.open, but I don't know how to process those. So I tried to convert from numpy to str with numpy.array2string(data), pass that to ratecv to get correct results, and then convert back to numpy with numpy.fromstring(data, dtype). Now the length of the data is 8 samples.

I think this is due to a mix-up of formats, but I don't know how to control it. I also haven't figured out what format the str from wave.open is in, so that I could force the same format here.
Here is the relevant part of my code:
import audioop
import numpy

def conv_sr(data, srold, fixSR, dType, chan = 1):
    state = None
    width = 2 # numpy.int16
    print "data shape", data.shape, type(data[0]) # returns shape 88200, type int16
    fragments = numpy.array2string(data)
    print "new fragments len", len(fragments), "type", type(fragments) # returns len 30, type str
    fragments_new, state = audioop.ratecv(fragments, width, chan, srold, fixSR, state)
    print "fragments", len(fragments_new), type(fragments_new[0]) # returns 16, type str
    data_to_return = numpy.fromstring(fragments_new, dtype=dType)
    return data_to_return
and I call it like this:
data1 = numpy.array(data1, dtype=dType)
data_to_copy = numpy.append(data1, data2)
data_to_copy = data_to_copy.sum(axis = 1) / chan
data_to_copy = data_to_copy.flatten() # because its mono
data_to_copy = conv_sr(data_to_copy, sr, fixSR, dType) #sr = 44100, fixSR = 22050
scipy.io.wavfile.write(filename, fixSR, data_to_copy)
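A note on the "len 30" reported inside conv_sr above: numpy.array2string returns a printable text rendering of the array (abbreviated with an ellipsis for large arrays), not the sample bytes, so audioop.ratecv ends up resampling the characters of that text. The sketch below shows the difference, using a zero-filled stand-in array (an assumption, not the real audio data):

```python
import numpy

# Stand-in for the 88200-sample mono int16 array from the question.
data = numpy.zeros(88200, dtype=numpy.int16)

# Printable rendering: a short string with an ellipsis, not audio data.
text = numpy.array2string(data)

# Raw sample bytes: 2 bytes for every int16 sample.
raw = data.tobytes()

print(len(text) < 100)  # True: only a few characters of text
print(len(raw))         # 176400 = 88200 samples * 2 bytes

# The raw bytes convert back to the original samples without loss.
restored = numpy.frombuffer(raw, dtype=numpy.int16)
print(numpy.array_equal(restored, data))  # True
```

Only the raw bytes carry the full signal, which is why the text route leaves just a handful of samples after the round trip.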
After a bit more research I found my mistake: 16-bit audio is made of two 8-bit 'cells' per sample, so the dtype I was using was wrong, and that is why I had the audio speed issue. I found the correct dtype here.

So, in conv_sr, I take a numpy array, convert it to a data string, pass that to the sample rate conversion, convert the result back to a numpy array for scipy.io.wavfile.write, and finally convert each pair of 8-bit values to the 16-bit format.
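To illustrate the point about two 8-bit cells per sample, the sketch below reads the same bytes with three different dtypes. The sample values are made up for the illustration, and which mismatch actually occurred in the original run is not stated, so this only shows how a wrong dtype changes the number of samples:

```python
import numpy

# Four made-up 16-bit samples.
samples = numpy.array([1, -2, 300, -400], dtype=numpy.int16)
raw = samples.tobytes()  # 8 bytes: 2 per sample

as_int16 = numpy.frombuffer(raw, dtype=numpy.int16)  # 4 values, the original samples
as_int8 = numpy.frombuffer(raw, dtype=numpy.int8)    # 8 values, one per byte
as_int32 = numpy.frombuffer(raw, dtype=numpy.int32)  # 2 values, two samples fused into one

print(len(as_int16), len(as_int8), len(as_int32))  # 4 8 2
```

Written out at the same sample rate, an array with half as many values plays in half the time, which matches the symptom of audio that is too fast.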
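def widthFinder(dType):
    # Sample width in bytes, parsed from the trailing digits of the dtype name
    # (e.g. "int16" -> 16 bits -> 2 bytes, "int8" -> 8 bits -> 1 byte).
    b = str(dType)
    try:
        bits = int(b[-2:])
    except ValueError:
        bits = int(b[-1:])
    width = bits // 8  # integer division: audioop expects an int
    return width

def conv_sr(data, srold, fixSR, dType, chan = 1):
    state = None
    width = widthFinder(dType)
    if width not in (1, 2, 4):
        width = 2
    # Raw sample bytes of the array
    fragments = data.tobytes()
    fragments_new, state = audioop.ratecv(fragments, width, chan, srold, fixSR, state)
    # Read each pair of 8-bit values as one 16-bit sample
    fragments_dtype = numpy.dtype((numpy.int16, {'x':(numpy.int8,0), 'y':(numpy.int8,1)}))
    # frombuffer replaces fromstring, whose binary mode is deprecated in NumPy
    data_to_return = numpy.frombuffer(fragments_new, dtype=fragments_dtype)
    data_to_return = data_to_return.astype(dType)
    return data_to_return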
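A possible simplification, offered as a sketch rather than a tested replacement: numpy.dtype(...).itemsize gives the sample width in bytes directly, and the bytes returned by audioop.ratecv can be read back with the same plain dtype as the input. The names sample_width and conv_sr_sketch are hypothetical, and the audioop call is guarded because that module was removed from the standard library in Python 3.13:

```python
import numpy

try:
    import audioop  # standard library up to Python 3.12, removed in 3.13
except ImportError:
    audioop = None

def sample_width(dtype):
    """Return the number of bytes per sample for a numpy dtype (2 for int16)."""
    return numpy.dtype(dtype).itemsize

def conv_sr_sketch(data, srold, fix_sr, chan=1):
    """Hypothetical variant of conv_sr that keeps the input dtype throughout."""
    width = sample_width(data.dtype)
    converted, _state = audioop.ratecv(data.tobytes(), width, chan, srold, fix_sr, None)
    # copy() because frombuffer returns a read-only view of the bytes object
    return numpy.frombuffer(converted, dtype=data.dtype).copy()

samples = numpy.arange(8, dtype=numpy.int16)
raw = samples.tobytes()
print(sample_width(numpy.int16), len(raw))  # 2 16

if audioop is not None:
    halved = conv_sr_sketch(numpy.zeros(88200, dtype=numpy.int16), 44100, 22050)
    print(len(halved), halved.dtype)  # roughly half of 88200 samples, still int16
```

Keeping one dtype from tobytes through frombuffer avoids the mismatch between byte width and sample width that caused the speed problem.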
If you find anything wrong, please feel free to correct me; I am still a learner.