
search a 2GB WAV file for dropouts using wave module

What is the best way to analyze a 2 GB WAV file (a 1 kHz tone) for audio dropouts using the wave module? I tried the script below:

import wave
file1 = wave.open("testdropout.wav", "r")
file2 = open("silence.log", "w")
for i in xrange(file1.getnframes()):
  frame = file1.readframes(i)

  zero = True

  for j in xrange(len(frame)):
      # check if amplitude is greater than 0
      # the ord() function converts the hex values to integers
    if ord(frame[j]) > 0:
      zero = False
      break

    if zero:
      print >> file2, 'dropout at second %s' % (file1.tell()/file1.getframerate())
file1.close()
file2.close()

I think a simple solution is to take advantage of how high the frame rate of audio files is. A sample file on my computer has a frame rate of 8,000, so every second of audio holds 8,000 samples. Missing audio will almost certainly span many consecutive frames, so you can cut the number of comparisons as drastically as your standards allow. If I were you, I would check every 1,000th sample instead of every single sample in the file. At 8,000 frames per second, that examines the audio once every 1/8 of a second. It's less precise, but hopefully it gets the job done.

import wave

file1 = wave.open("testdropout.wav", "rb")
file2 = open("silence.log", "w")
framerate = file1.getframerate()
step = 1000  # examine one frame out of every 1000

for pos in range(0, file1.getnframes(), step):
    file1.setpos(pos)  # positions are counted in frames
    # bytearray() yields integers on both Python 2 and 3, so ord() is not needed
    frame = bytearray(file1.readframes(1))

    # the frame is silent only if every byte in it is zero
    # (true for 16-bit PCM; 8-bit WAV is unsigned, so silence is 128 there)
    if not any(frame):
        file2.write('dropout at second %s\n' % (pos // framerate))

file1.close()
file2.close()
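One thing worth checking before you pick a stride: a steady tone repeats, so a stride that is a whole number of periods always lands on the same phase. With a 1 kHz tone at 8,000 frames per second the period is 8 frames, and 1000 = 125 × 8, so every 1,000th frame holds the same value. If that value happens to be zero, a healthy file looks dead. Here is a small sketch of the effect on a synthetic tone; the helper names, the amplitude, and the 16-bit mono format are my own assumptions for illustration, not something from the question.

```python
import math
import struct

FRAMERATE = 8000   # frames per second, as in the example above
TONE_HZ = 1000     # test-tone frequency from the question
AMPLITUDE = 10000  # arbitrary illustrative peak value for 16-bit samples


def make_tone(nframes):
    """Return nframes of a 16-bit mono sine tone as raw bytes."""
    samples = [
        int(round(AMPLITUDE * math.sin(2 * math.pi * TONE_HZ * n / FRAMERATE)))
        for n in range(nframes)
    ]
    return struct.pack("<%dh" % nframes, *samples)


def strided_frames_all_zero(data, framesize, step):
    """True if every step-th frame in data consists only of zero bytes."""
    view = bytearray(data)
    for start in range(0, len(view), framesize * step):
        if any(view[start:start + framesize]):
            return False
    return True


tone = make_tone(FRAMERATE)  # one second of healthy tone, no dropout

# 1000 frames is exactly 125 periods, so each probe hits the zero crossing.
false_alarm = strided_frames_all_zero(tone, framesize=2, step=1000)

# 1001 frames is not a whole number of periods, so the probes drift in phase.
no_alarm = strided_frames_all_zero(tone, framesize=2, step=1001)
```

Here `false_alarm` comes out `True` even though the tone has no dropout, while `no_alarm` is `False`. A stride that is not a multiple of the tone's period, or a rule that requires several consecutive zero frames, avoids this trap.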

At the moment, each call to readframes(i) pulls in far more frames than you need, which is not ideal. If you look at the methods available on a "Wave_read" object, one of them is setpos(pos), which sets the position of the file pointer to pos, and tell() returns the current position (readframes() itself moves the position forward by the number of frames it returns). If you keep track of this position, you should be able to hold only the frame you want in memory at any given time, preventing errors. Below is a rough outline:

import wave

file1 = wave.open("testdropout.wav", "rb")
file2 = open("silence.log", "w")
framerate = file1.getframerate()

def scan_frame(frame):
    # Return True only if every byte of the frame is zero (a silent frame).
    # bytearray() yields integers on both Python 2 and 3, so ord() is not needed.
    return not any(bytearray(frame))

frame = file1.readframes(1)  # only read the frame at the current file position
while frame:
    if scan_frame(frame):
        # tell() is now one past the frame that was just read
        file2.write('dropout at second %s\n' % ((file1.tell() - 1) // framerate))

    # readframes() advances the file position by itself, so no setpos() call
    # is needed here; setpos(pos + len(frame)) would skip frames, because
    # positions are counted in frames while len(frame) is in bytes
    frame = file1.readframes(1)

file1.close()
file2.close()

Hope this can help!
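If you are unsure what a "single unit" is for tell() and setpos(), a quick experiment on a tiny in-memory file settles it. This is only a sketch: the 10-frame, 16-bit stereo file and the variable names are made up for the test, and it is written for Python 3.

```python
import io
import struct
import wave

# Build a tiny 16-bit stereo WAV in memory: 10 frames, 4 bytes per frame.
# The sample values 0..19 make each frame easy to recognise.
buf = io.BytesIO()
writer = wave.open(buf, "wb")
writer.setnchannels(2)
writer.setsampwidth(2)
writer.setframerate(8000)
writer.writeframes(struct.pack("20h", *range(20)))
writer.close()

buf.seek(0)
reader = wave.open(buf, "rb")

frame = reader.readframes(1)
bytes_per_frame = len(frame)    # sample width * number of channels
pos_after_one = reader.tell()   # position counted in frames, not bytes

reader.setpos(5)                # jump straight to frame index 5
jumped = struct.unpack("2h", reader.readframes(1))
pos_after_jump = reader.tell()

reader.close()
```

After running this, `bytes_per_frame` is 4 while `pos_after_one` is 1, so positions are counted in frames and readframes() moves the position on its own. `jumped` is `(10, 11)`, the two samples of frame 5, and `pos_after_jump` is 6.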

I haven't used the wave module before, but file1.readframes(i) looks like it reads 1 frame when you're at the first frame, 2 frames at the second, 10 frames at the tenth, and so on. A 2 GB CD-quality file has hundreds of millions of frames, so by the time i reaches 100,000 each call is asking for 100,000 frames. Is it getting slower each time through the loop as well?
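To see what that call actually does, here is a small experiment on a made-up 10-frame file (the helper name and the file contents are assumptions for the test, written for Python 3). It records how many frames each readframes(i) call hands back.

```python
import io
import wave


def make_silent_wav(nframes, framerate=8000):
    """Return an in-memory 16-bit mono WAV holding nframes of silence."""
    buf = io.BytesIO()
    writer = wave.open(buf, "wb")
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(framerate)
    writer.writeframes(b"\x00\x00" * nframes)
    writer.close()
    buf.seek(0)
    return buf


reader = wave.open(make_silent_wav(10), "rb")
framesize = reader.getsampwidth() * reader.getnchannels()

frames_returned = []
for i in range(reader.getnframes()):
    data = reader.readframes(i)   # same call as in the question
    frames_returned.append(len(data) // framesize)

reader.close()
```

`frames_returned` ends up as `[0, 1, 2, 3, 4, 0, 0, 0, 0, 0]`. The requests do grow by one frame per pass, and because every call also advances the position, the file is used up after roughly sqrt(2N) passes for N frames. Every later pass gets an empty string, so the loop keeps spinning without examining any audio.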

And from my comment: in Python 2, range() builds the full list in memory first while xrange() doesn't, but not using range at all helps even more.

And push the looping down into the lower layers with any() to make the code shorter, and possibly faster:

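import wave

file1 = wave.open("testdropout.wav", "rb")
file2 = open("silence.log", "w")

chunksize = file1.getframerate()  # number of frames in one second of audio
second = 0                        # index of the second held in chunk
chunk = file1.readframes(chunksize)

while chunk:
  # bytearray() yields integers on both Python 2 and 3; any() stops at the
  # first non-zero byte, so only silent chunks are scanned in full
  if not any(bytearray(chunk)):
    file2.write('dropout at second %s\n' % second)

  second += 1
  chunk = file1.readframes(chunksize)

file1.close()
file2.close()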

This should read the file in 1-second chunks.
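A whole-second check only fires when every byte in that second is zero, so a shorter dropout, or one that straddles two chunks, slips through. If that matters, one option is to track runs of consecutive all-zero frames across chunk boundaries. The sketch below is my own assumption of how that could look, not something tested on a real 2 GB file: `find_dropouts`, `min_frames`, and `chunk_frames` are hypothetical names, it only handles 16-bit PCM, it targets Python 3, and the per-frame Python loop will be slow on a file that size.

```python
import array
import io
import wave


def find_dropouts(wav_file, min_frames=80, chunk_frames=4096):
    """Return (start_second, length_seconds) for each run of all-zero frames.

    wav_file: a filename or binary file object holding 16-bit PCM.
    min_frames: shortest run worth reporting (hypothetical default). It also
    filters out the single zero samples a healthy tone has at zero crossings.
    """
    reader = wave.open(wav_file, "rb")
    if reader.getsampwidth() != 2:
        reader.close()
        raise ValueError("this sketch only handles 16-bit PCM")
    rate = reader.getframerate()
    channels = reader.getnchannels()

    dropouts = []
    run_start = None  # frame index where the current zero run began
    pos = 0           # index of the frame being examined

    while True:
        data = reader.readframes(chunk_frames)
        if not data:
            break
        samples = array.array("h", data)  # zero is zero in any byte order
        for i in range(0, len(samples), channels):
            silent = not any(samples[i:i + channels])
            if silent and run_start is None:
                run_start = pos
            elif not silent and run_start is not None:
                if pos - run_start >= min_frames:
                    dropouts.append((run_start / rate, (pos - run_start) / rate))
                run_start = None
            pos += 1

    # a run that lasts until the end of the file
    if run_start is not None and pos - run_start >= min_frames:
        dropouts.append((run_start / rate, (pos - run_start) / rate))

    reader.close()
    return dropouts


# Demo on a synthetic mono file: one second at 8000 Hz, zeros in frames 2000-2399.
rate = 8000
signal = array.array("h", [1000] * rate)  # constant stand-in for real audio
for n in range(2000, 2400):
    signal[n] = 0

buf = io.BytesIO()
writer = wave.open(buf, "wb")
writer.setnchannels(1)
writer.setsampwidth(2)
writer.setframerate(rate)
writer.writeframes(signal.tobytes())
writer.close()
buf.seek(0)

found = find_dropouts(buf)
```

On the synthetic file, `found` is `[(0.25, 0.05)]`: a dropout starting at 0.25 s and lasting 0.05 s, which the one-second check would have missed.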
