Audio data string format to numpy array

Question

I am trying to convert the sample rate (from 44100 to 22050) of a numpy.array with 88200 samples that I have already processed (added silence and converted to mono). I tried converting the array with audioop.ratecv, and it works, but it returns a str instead of a numpy array. When I wrote that data with scipy.io.wavfile.write, half of the data was lost and the audio played twice as fast (instead of slower, which would at least make some sense). audioop.ratecv works fine with str data such as wave.open returns, but I don't know how to process those, so I tried converting from numpy to str with numpy.array2string(data), passing that to ratecv to get correct results, and then converting back to numpy with numpy.fromstring(data, dtype). Now the length of the data is 8 samples. I think this is due to a mix-up of formats, but I don't know how to control it. I also haven't figured out what kind of str format wave.open returns, so that I could force the same format here.

Here is the relevant part of my code:

def conv_sr(data, srold, fixSR, dType, chan = 1): 
    state = None
    width = 2 # numpy.int16
    print "data shape", data.shape, type(data[0]) # returns shape 88200, type int16
    fragments = numpy.array2string(data)
    print "new fragments len", len(fragments), "type", type(fragments) # return len 30 type str
    fragments_new, state = audioop.ratecv(fragments, width, chan, srold, fixSR, state)
    print "fragments", len(fragments_new), type(fragments_new[0]) # returns 16, type str
    data_to_return = numpy.fromstring(fragments_new, dtype=dType)
    return data_to_return

and I call it like this:

data1 = numpy.array(data1, dtype=dType)
data_to_copy = numpy.append(data1, data2)
data_to_copy = data_to_copy.sum(axis = 1) / chan
data_to_copy = data_to_copy.flatten() # because its mono

data_to_copy = conv_sr(data_to_copy, sr, fixSR, dType) #sr = 44100, fixSR = 22050

scipy.io.wavfile.write(filename, fixSR, data_to_copy)
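
A minimal sketch of why the intermediate string came out so short, assuming NumPy's default print options: numpy.array2string produces the human-readable text of the array (abbreviated for long arrays), not the raw sample bytes that audioop.ratecv expects, whereas tobytes() yields the raw bytes. The array contents below are made up for illustration.

```python
import numpy

# Hypothetical stand-in for the audio data: 88200 samples of 16-bit mono
data = numpy.zeros(88200, dtype=numpy.int16)

text = numpy.array2string(data)   # printable summary of the array, not audio bytes
raw = data.tobytes()              # raw sample bytes, 2 bytes per int16 sample

print(len(text), type(text))      # a short text, unrelated to the audio length
print(len(raw), type(raw))        # 2 * number of samples
```

Feeding the printable text to ratecv means resampling the characters of the summary rather than the audio, which would explain ending up with only a handful of "samples".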

Answer 1

After a bit more research I found my mistake. 16-bit audio is made of two 8-bit 'cells', so the dtype I was using was wrong, and that is why I had the audio speed issue. I found the correct dtype here. So, in conv_sr, I pass in a numpy array, convert it to a byte string, pass that to the sample-rate conversion, convert the result back to a numpy array for scipy.io.wavfile.write, and finally combine each pair of 8-bit values into the 16-bit format:
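
To illustrate the dtype point, here is a small sketch, assuming 16-bit samples: each sample occupies two bytes, so reading the byte string back with an 8-bit dtype yields twice as many values, while a 16-bit dtype recovers the original samples. The sample values are arbitrary.

```python
import numpy

samples = numpy.array([0, 1000, -1000, 32767, -32768], dtype=numpy.int16)
raw = samples.tobytes()                              # 2 bytes per sample

as_int16 = numpy.frombuffer(raw, dtype=numpy.int16)  # same width as written
as_int8 = numpy.frombuffer(raw, dtype=numpy.int8)    # wrong width: one value per byte

print(as_int16)       # the original samples
print(as_int8.size)   # twice as many values as samples
```

A mismatch like this between the byte width and the dtype changes the number of samples written out, and with it the apparent duration and speed of the audio.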

import audioop  # standard library; removed in Python 3.13
import numpy

def widthFinder(dType):
    # Sample width in bytes, taken from the dtype itself (2 for int16)
    return numpy.dtype(dType).itemsize

def conv_sr(data, srold, fixSR, dType, chan = 1):
    state = None
    width = widthFinder(dType)
    if width != 1 and width != 2 and width != 4:
        width = 2
    # audioop works on raw bytes, so serialize the array first
    fragments = data.tobytes()
    fragments_new, state = audioop.ratecv(fragments, width, chan, srold, fixSR, state)
    # Read each pair of bytes back as one 16-bit sample
    fragments_dtype = numpy.dtype((numpy.int16, {'x':(numpy.int8,0), 'y':(numpy.int8,1)}))
    data_to_return = numpy.frombuffer(fragments_new, dtype=fragments_dtype)
    data_to_return = data_to_return.astype(dType)
    return data_to_return
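
On the open question of what wave.open hands back: readframes returns raw bytes (a str in Python 2), which for 16-bit WAV files are little-endian signed integers, interleaved by channel. A sketch under that assumption, using an in-memory file and made-up sample values so it does not depend on any file on disk:

```python
import io
import wave
import numpy

samples = numpy.array([0, 1200, -1200, 300], dtype='<i2')  # made-up mono samples

# Write a tiny 16-bit mono WAV into memory
buf = io.BytesIO()
w = wave.open(buf, 'wb')
w.setnchannels(1)
w.setsampwidth(2)        # 2 bytes per sample = 16 bit
w.setframerate(44100)
w.writeframes(samples.tobytes())
w.close()

# Read it back: readframes returns the raw sample bytes
buf.seek(0)
r = wave.open(buf, 'rb')
frames = r.readframes(r.getnframes())
width = r.getsampwidth()
r.close()

decoded = numpy.frombuffer(frames, dtype='<i2')  # little-endian int16
print(type(frames), width, decoded)
```

So data obtained from numpy's tobytes() on an int16 array has the same layout as the frames read from a 16-bit WAV file, at least on little-endian machines.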

If you find anything wrong, please feel free to correct me; I am still a learner.
