I am trying to write some speech detection code using the zero crossing rate. From what I have read, the zero crossing rate should take a moderate value while someone is speaking, neither too high nor too low. But when I speak into the microphone, the zero crossing rate becomes higher than it was with just background noise (of which there is barely any). This is how I am calculating it right now:
((audioData[:-1] * audioData[1:]) < 0).sum()
audioData is a NumPy array whose contents are the result of pyAudioStream.read(). Could anyone tell me the correct way to calculate this? Thanks.
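For reference, here is a rough sketch of how the raw bytes might be turned into an array before counting sign changes. It assumes the stream was opened with 16-bit signed samples (pyaudio.paInt16), which the question does not state, and the helper names bytes_to_samples and count_sign_changes_by_product are made up for illustration. Under that assumption the element-wise product can wrap around in int16 once the signal gets loud, which would show up as extra "crossings", so the sketch widens the samples before multiplying.

```python
import numpy as np

def bytes_to_samples(raw: bytes) -> np.ndarray:
    # Assumption: 16-bit signed mono samples (e.g. a stream opened with paInt16).
    return np.frombuffer(raw, dtype=np.int16)

def count_sign_changes_by_product(samples: np.ndarray) -> int:
    # Widen to int64 first so the product of two int16 values cannot wrap around.
    wide = samples.astype(np.int64)
    return int(((wide[:-1] * wide[1:]) < 0).sum())

# Stand-in for pyAudioStream.read(): four loud samples that never change sign.
raw = np.array([200, 200, 300, 300], dtype=np.int16).tobytes()
samples = bytes_to_samples(raw)

# Same formula as in the question, applied directly to int16 data.
naive = int(((samples[:-1] * samples[1:]) < 0).sum())
# Same formula after widening.
safe = count_sign_changes_by_product(samples)
```

With these four samples the int16 product reports sign changes that are not there, while the widened version reports none. If the real stream uses a float format instead, this overflow does not apply.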
That's a lot of unnecessary multiplication. Using a Boolean comparison and running it through np.diff will probably be faster:
zero_crosses = np.nonzero(np.diff(audioData > 0))[0]
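What this is doing:
1. Making a Boolean array of where the signal is positive (audioData > 0)
2. Taking the pairwise difference (np.diff), so locations of zero crossings become nonzero. On a Boolean array np.diff simply marks each change as True; cast to an integer type first if you want 1 (rising) and -1 (falling)
3. Picking out the indices of the nonzero entries (np.nonzero)
Then if you want the number of crossings, you can just take zero_crosses.size.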
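The steps above can be tried out on a synthetic signal. This is only a sketch: the sine wave, the sample_rate value and the phase offset are assumptions standing in for real microphone data, and the int8 cast is one way to recover the crossing direction, since np.diff on a Boolean array only reports that a change happened.

```python
import numpy as np

# Synthetic stand-in for audioData: 5 cycles of a sine wave over one second.
sample_rate = 1000  # hypothetical sample rate in Hz
t = np.arange(sample_rate) / sample_rate
audioData = np.sin(2 * np.pi * 5 * t + 0.1)  # small phase offset avoids exact zeros

positive = audioData > 0                         # Boolean sign mask
zero_crosses = np.nonzero(np.diff(positive))[0]  # index of the sample just before each crossing
num_crossings = zero_crosses.size

# Casting to an integer type first keeps the direction: +1 rising, -1 falling.
direction = np.diff(positive.astype(np.int8))
rising = np.count_nonzero(direction == 1)
falling = np.count_nonzero(direction == -1)
```

Dividing num_crossings by the number of samples (or by the frame duration) gives a rate that can be compared across frames of different lengths.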
As a bonus you have the timings of all the crosses so you can do things like a histogram that shows where more crosses are happening in your time history.
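As a rough illustration of that histogram idea, the sketch below bins the crossing times with np.histogram. The two-tone test signal, the sample_rate value and the choice of two half-second bins are all assumptions made up for the example.

```python
import numpy as np

sample_rate = 1000  # hypothetical sample rate in Hz
t = np.arange(sample_rate) / sample_rate

# Synthetic stand-in for audioData: a 10 Hz tone for the first half second,
# then a 50 Hz tone, so the second half has many more zero crossings.
audioData = np.where(
    t < 0.5,
    np.sin(2 * np.pi * 10 * t + 0.1),
    np.sin(2 * np.pi * 50 * t + 0.1),
)

zero_crosses = np.nonzero(np.diff(audioData > 0))[0]
cross_times = zero_crosses / sample_rate  # crossing positions in seconds

# Count the crossings that fall into each half-second bin.
counts, edges = np.histogram(cross_times, bins=2, range=(0.0, 1.0))
```

Here counts[1] comes out larger than counts[0], which reflects the higher-frequency second half; with real audio, narrower bins would show where in the recording the crossings cluster.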