WAV file modifier using Python

Question

I wrote a simple Python program that reads a wave file and, after modifying it, stores it as a new file.

import codecs, wave

#convert a number to its two's complement representation (a non-negative value is returned unchanged)
def convert_to_twos(value, wid_len=16):
    if value < 0 :
        value = value + (1 << wid_len)
    return value

#recover the signed value of a two's complement number.
def twos_back_value(value, wid_len=16):
    if value & (1 << wid_len -1):
        value = value - (1 << wid_len)
    return value

#opening files
input_file = wave.open(r"<address of input wave file>", 'r')
output_file = wave.open(r"<an address for output wave file>", 'w')

#Get input file parameters and set them to the output file after modifying the channel number.
out_params = [None, None, None, None, None, None]
in_params = input_file.getparams()
out_params[0] = 1 # I want to have a mono type wave file in output. so I set the channels = 1
out_params[1] = in_params[1] #Frame Width
out_params[2] = in_params[2] #Sample Rate
out_params[3] = in_params[3] #Number of Frames
out_params[4] = in_params[4] #Type
out_params[5] = in_params[5] #Compressed or not
output_file.setparams(out_params)

#reading frames from first file and storing in the second file
for frame in range(out_params[2]):
    value = int(codecs.getencoder('hex')(input_file.readframes(1))[0][:4], 16) #converting first two bytes of each frame (let assume each channel has two bytes frame length) to int (from byte string).
    t_back_value = twos_back_value( value ,out_params[1]*8)
    new_value = int(t_back_value * 1)
    new_twos = convert_to_twos(new_value, out_params[1]*8)
    to_write = new_twos.to_bytes((new_twos.bit_length() + 7) // 8, 'big')
    output_file.writeframes(to_write)


#closing files
input_file.close()
output_file.close()

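The problem is that when I run the program above and play the output file, all I hear is noise and nothing else! (I only expected the same audio, just in mono.)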

Update:

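I found something strange. According to the documentation, readframes(n) reads and returns at most n frames of audio, as a string of bytes. So I expected this function to return only hex values. In practice, though, I see some odd non-hex values: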

read_frame = input_file.readframes(1)
print (read_frame)
print (codecs.getencoder('hex')(read_frame)[0])
print ("")

The code above, run inside the for loop, prints:

b'\xe3\x00\xc7\xf5'
b'e300c7f5'

b'D\xe8\xa1\xfd'
b'44e8a1fd'

b'\xde\x08\xb2\x1c'
b'de08b21c'

b'\x17\xea\x10\xe9'
b'17ea10e9'

b'{\xf7\xbc\xf5'
b'7bf7bcf5'

b'*\xf6K\x08'
b'2af64b08'

As you can see, read_frame contains some non-hex values (*, {, D, ... for example). What are these?

Answer 1

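The values you are seeing are the four bytes of each frame: two bytes for the first channel and two bytes for the second (assuming 16-bit samples). For a mono WAV, you would see only two bytes. Python prints any byte that corresponds to a printable ASCII character as that character, so D, { and * are ordinary bytes (0x44, 0x7b and 0x2a), not corrupted data.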
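As a quick check of the point above, the following sketch takes a few of the frames printed in the question and shows the same bytes in three forms. The variable names (frames, printable) are only illustrative, and the data is assumed to be stereo with 16-bit samples, as in the question.

```python
# Frames copied from the question's output (stereo, 16-bit samples).
frames = [b'\xe3\x00\xc7\xf5', b'D\xe8\xa1\xfd', b'{\xf7\xbc\xf5', b'*\xf6K\x08']

for frame in frames:
    # Every frame holds exactly four bytes, however it is printed.
    assert len(frame) == 4
    # list(frame) shows the raw byte values; frame.hex() shows them as hex digits.
    print(frame, list(frame), frame.hex())

# Map each "odd" character to the byte value it stands for.
printable = {chr(b): hex(b) for b in b'D{*K'}
print(printable)  # {'D': '0x44', '{': '0x7b', '*': '0x2a', 'K': '0x4b'}
```

The hex strings match what codecs.getencoder('hex') printed in the question, which suggests the data itself is fine and only its printed representation looked unusual.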

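The following approach should get you on the right track. You need to use Python's struct library to convert the binary frame values into signed integers. You can then manipulate them as needed. In my example, I simply multiply by 2/3: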

import wave
import struct

#opening files
input_file = wave.open(r"sample.wav", 'rb')
output_file = wave.open(r"sample_out.wav", 'wb')

#Get input file parameters and set them to the output file after modifying the channel number.
in_params = list(input_file.getparams())

out_params = in_params[:]
out_params[0] = 1
output_file.setparams(out_params)

nchannels, sampwidth, framerate, nframes, comptype, compname = in_params
format = '<{}h'.format(nchannels)

#reading frames from first file and storing in the second file
for index in range(nframes):
    frame = input_file.readframes(1)
    data = struct.unpack(format, frame)
    value = data[0]     # first (left) channel only
    value = (value * 2) // 3    # apply a simple function to each value
    output_file.writeframes(struct.pack('<h', value))

#closing files
input_file.close()
output_file.close()

Note that processing a wave file one frame at a time like this is very slow. It could be sped up by reducing the number of calls to writeframes.
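One way to cut down the number of calls, sketched here under a few assumptions, is to read every frame in one go, convert all samples at once, and write the result with a single writeframes call. The function name left_channel_scaled and its parameters are hypothetical, and the sketch only handles 16-bit samples.

```python
import struct
import wave


def left_channel_scaled(in_path, out_path, num=2, den=3):
    """Write a mono copy of in_path's first channel, scaled by num/den."""
    with wave.open(in_path, 'rb') as src:
        nchannels = src.getnchannels()
        sampwidth = src.getsampwidth()
        framerate = src.getframerate()
        if sampwidth != 2:
            raise ValueError('this sketch only handles 16-bit samples')
        raw = src.readframes(src.getnframes())   # one read for the whole file

    count = len(raw) // sampwidth                # total samples, all channels
    samples = struct.unpack('<{}h'.format(count), raw)
    # Samples are interleaved, so every nchannels-th value belongs to channel 0.
    left = [(s * num) // den for s in samples[::nchannels]]

    with wave.open(out_path, 'wb') as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sampwidth)
        dst.setframerate(framerate)
        # A single write; wave fills in the frame count in the header on close.
        dst.writeframes(struct.pack('<{}h'.format(len(left)), *left))
```

Calling left_channel_scaled("sample.wav", "sample_out.wav") would do the same job as the loop-based script. This version holds the whole file in memory, so for very large files it may be better to read and write in blocks of a few thousand frames instead.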

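The format variable in the frame-by-frame script holds the struct format string needed to unpack the binary values. For a 2-channel WAV file with 16-bit samples, each frame contains 4 bytes, so format is set to <hh, which means struct.unpack yields two fields, one signed integer per channel. The four bytes thus become a tuple of two integers, one for each channel.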
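To make the <hh format concrete, here is a small sketch that unpacks the first frame printed in the question. It assumes stereo data with 16-bit little-endian samples; the names left, right, scaled, packed and as_big_endian are only illustrative.

```python
import struct

frame = b'\xe3\x00\xc7\xf5'                # first frame printed in the question

# '<hh' = little-endian, two signed 16-bit integers (one per channel).
left, right = struct.unpack('<hh', frame)
print(left, right)                         # 227 -2617

# Apply the same 2/3 scaling to the left channel and pack it back to bytes.
scaled = (left * 2) // 3
packed = struct.pack('<h', scaled)
print(scaled, packed)                      # 151 b'\x97\x00'

# For comparison: the same two bytes read as big-endian give a different value.
as_big_endian = int.from_bytes(frame[:2], 'big', signed=True)
print(as_big_endian)                       # -7424
```

The last line hints at why the question's program produced noise: parsing the hex string 'e300' with int(..., 16) treats the bytes as big-endian, whereas WAV sample data is stored little-endian.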
