I read a 24-bit mono audio .wav file into an array of type <i4 (<i3 doesn't exist):
data = numpy.fromfile(fid, dtype='<i4', count=size//3)
What should I do to get the audio samples properly? Should I swap the byte order or something like that, and if so, how?
You can convert the data into a numpy array of uint8, then add a 0 byte to each sample by using reshape and hstack:
In [1]: import numpy as np
I'm using a generated sequence here as an example.
In [2]: a = np.array([1,2,3]*10, dtype=np.uint8)
In [3]: a
Out[3]:
array([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2,
3, 1, 2, 3, 1, 2, 3], dtype=uint8)
In [4]: a = a.reshape((-1,3))
Reshape allows you to group the samples:
In [5]: a
Out[5]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]], dtype=uint8)
Make the zeros that have to be added.
In [6]: b = np.zeros(10, dtype=np.uint8).reshape((-1,1))
In [7]: b
Out[7]:
array([[0],
[0],
[0],
[0],
[0],
[0],
[0],
[0],
[0],
[0]], dtype=uint8)
Now we add the zeros. Assuming the result is interpreted as little-endian (which is how WAV stores its samples), the added zero goes at the front, where it becomes the least significant byte and scales each sample by 256.
(I hope I got this endianness stuff right. If the sample now sounds very faint, I got it wrong and you need to use (a,b) instead of (b,a).)
In [8]: c = np.hstack((b, a))
In [9]: c
Out[9]:
array([[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3]], dtype=uint8)
Reshape it back.
In [10]: c.reshape((1,-1))
Out[10]:
array([[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1,
2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]], dtype=uint8)
Convert to bytes:
In [11]: bytearray(c.reshape((1,-1)))
Out[11]: bytearray(b'\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03')
Now you have 4-byte samples.
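The steps above can be collected into one function. This is only a minimal sketch, assuming the raw sample bytes are already in memory and are packed little-endian 24-bit PCM; the name pad_24bit_to_int32 is made up for illustration. It also views the padded bytes as '<i4', which makes it easy to check that the zero really landed in the low byte: every sample should come out multiplied by 256 with its sign intact.

```python
import numpy as np


def pad_24bit_to_int32(raw: bytes) -> np.ndarray:
    """Pad packed little-endian 24-bit samples to int32 (values scaled by 256).

    Hypothetical helper; assumes len(raw) is a multiple of 3.
    """
    # One row per sample, three bytes per row.
    a = np.frombuffer(raw, dtype=np.uint8).reshape(-1, 3)
    # One zero byte per sample.
    b = np.zeros((a.shape[0], 1), dtype=np.uint8)
    # Zero first: in little-endian order it is the least significant byte.
    c = np.hstack((b, a))
    # Reinterpret every 4-byte row as one little-endian signed 32-bit integer.
    return c.view('<i4').reshape(-1)


# Three hand-made samples: 1, -1 and 0x030201, packed as 3 bytes each.
raw = b''.join(v.to_bytes(3, 'little', signed=True) for v in (1, -1, 0x030201))
samples = pad_24bit_to_int32(raw)
print(samples)
```

Dividing the result by 256 (or shifting right by 8) would give back the original 24-bit values, if the unscaled numbers are needed.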
Here is the solution for reading 24-bit files (thanks to Warren Weckesser's gist: https://gist.github.com/WarrenWeckesser/7461781):
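data = numpy.fromfile(fid, dtype='u1', count=size)  # first read byte by byte
a = numpy.empty((len(data) // 3, 4), dtype='u1')  # integer division: the shape must be an int
a[:, :3] = data.reshape((-1, 3))  # copy the three data bytes of each sample
a[:, 3:] = (a[:, 3 - 1:3] >> 7) * 255  # sign-extend: 0xFF if the top bit is set, else 0x00
data = a.view('<i4').reshape(a.shape[:-1])  # view each 4-byte row as a little-endian int32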
This can be directly inserted in def _read_data_chunk(fid, noc, bits): (scipy\io\wavfile.py).
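For trying the sign-extension approach outside of scipy, here is a self-contained sketch. It assumes the packed 24-bit little-endian bytes are already in memory rather than being read from fid, and the name read_24bit_samples is hypothetical. Unlike zero-padding at the front, this keeps the original sample values unscaled.

```python
import numpy as np


def read_24bit_samples(raw: bytes) -> np.ndarray:
    """Sign-extend packed little-endian 24-bit samples to int32.

    Hypothetical helper; assumes len(raw) is a multiple of 3.
    """
    data = np.frombuffer(raw, dtype='u1')
    a = np.empty((len(data) // 3, 4), dtype='u1')
    # The three bytes from the file become the low three bytes of each row.
    a[:, :3] = data.reshape(-1, 3)
    # The fourth byte repeats the sign bit: 0xFF for negative, 0x00 otherwise.
    a[:, 3:] = (a[:, 2:3] >> 7) * 255
    return a.view('<i4').reshape(a.shape[:-1])


# Round trip over the extremes of the 24-bit range.
values = [0, 1, -1, 8388607, -8388608]
raw = b''.join(v.to_bytes(3, 'little', signed=True) for v in values)
decoded = read_24bit_samples(raw)
print(decoded)
```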