I'm working on implementing the Pitch Class Profile (PCP) as proposed by Takuya Fujishima. I've done my best to implement the equation (using scipy and numpy); however, I've been getting some rather odd results. I debated posting this on DSP, but figured this is more of a coding problem than a problem with understanding the equation.
In any case, here is my code.
import scipy.io.wavfile
import numpy as np
import math
import sys

class PCP:

    def __init__(self):
        self.note_references = [16.35, 17.32, 18.35, 19.45, 20.60, 21.83, 23.12, 24.50, 25.96, 27.50, 29.14, 30.87]
        self.results = {}

    def create_fft(self, filename):
        self.rate, self.data = scipy.io.wavfile.read('fmin.wav')
        print "Data from the File: \n", self.data
        self.frames = self.data.size
        print "Number of Frames: ", self.frames
        print "Rate: ", self.rate
        self.fft_results = np.fft.rfft(self.data) ##fft computing and normalization
        print "Results from the FFT: \n", self.fft_results

    # The work of the following classes was almost entirely based on a
    # thread in DSP. Here is the link to the particular article
    # http://dsp.stackexchange.com/questions/13722/pitch-class-profiling
    # This function returns the values of the notes given the spectrograph
    def m_func(self, l, p):
        #M(l) = round(12 * log_2( (f_s*l)/(N*f_ref) ) ) % 12
        #print "L: ", l
        #print "Note: ", p
        a = self.rate * l
        b = self.frames * self.note_references[p]
        c = 12 * np.log2(a/b)
        d = np.round(c)
        e = np.mod(d.all(), 12)
        #print "Result: ", e
        #raw_input()
        return e

    def pcp(self, p):
        r = 0
        for l in self.fft_results:
            result = self.m_func(l[0], p)
            #print "actual returned result", result
            if result == p:
                r+=1
                #print "There was a match! Add it!"
        return r

    def calculate_PCP(self):
        for p in range(0,11): #for all 12 notes
            self.results[p] = self.pcp(p)

    def print_results(self):
        for i in self.results.keys():
            print i , ":" , self.results[i]

def main():
    m = PCP()
    m.create_fft("fmin.wav")
    m.calculate_PCP()
    m.print_results()

if __name__ == '__main__':
    main()
Here are the outputs:
Data from the File:
[[16 15]
[ 9 9]
[15 15]
...,
[ 0 0]
[ 0 0]
[ 0 0]]
Number of Frames: 352800
Rate: 44100
Results from the FFT:
[[ 31.+0.j 1.+0.j]
[ 18.+0.j 0.+0.j]
[ 30.+0.j 0.+0.j]
...,
[ 0.+0.j 0.+0.j]
[ 0.+0.j 0.+0.j]
[ 0.+0.j 0.+0.j]]
PCP.py:36: RuntimeWarning: divide by zero encountered in log2
c = 12 * np.log2(a/b)
PCP.py:36: RuntimeWarning: invalid value encountered in cdouble_scalars
c = 12 * np.log2(a/b)
0 : 143
1 : 176263
2 : 0
3 : 0
4 : 0
5 : 0
6 : 0
7 : 0
8 : 0
9 : 0
10 : 0
The file contains a piano playing an F-minor chord (corresponding to 0, 5, and 8 in the results dictionary, i.e. C, F, and G#/Ab). The results, however, indicate a very strong presence of C#/Db, and I can certainly verify there is no C# in the recording. I'd appreciate any and all help!
Pitch frequency is different from spectral frequency, so a note's pitch class is not simply the content of every 12th FFT magnitude bin (especially for recordings of actual musical sounds). If nothing else, any strong odd harmonics (the 3rd, 5th, and so on), which are not power-of-two multiples of the fundamental, will end up in a different pitch class bin.
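To make the harmonic point concrete, here is a small sketch (Python 3) that maps the first six harmonics of a single note to their nearest equal-tempered pitch class. The helper names `pitch_class` and `harmonic_classes` are made up for this illustration, F3 is assumed to be 174.61 Hz, and the 16.35 Hz reference is the C entry from `note_references` in the question.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
C0_HZ = 16.35  # same C reference as note_references[0] in the question


def pitch_class(freq_hz, f_ref=C0_HZ):
    """Nearest equal-tempered pitch class (0 = C) for a frequency in Hz."""
    return int(round(12 * math.log2(freq_hz / f_ref))) % 12


def harmonic_classes(fundamental_hz, n_harmonics=6):
    """Pitch-class names of the first n_harmonics of a fundamental."""
    return [NOTE_NAMES[pitch_class(k * fundamental_hz)]
            for k in range(1, n_harmonics + 1)]


# Harmonics 1..6 of an (assumed) F3 at 174.61 Hz
classes = harmonic_classes(174.61)
print(classes)
```

Under these assumptions the 1st, 2nd and 4th harmonics stay on F, while the 3rd and 6th land on C and the 5th lands on A, a note that is not part of an F-minor chord at all.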
The algorithm referenced only works for a restricted class of waveforms, which are likely not representative of live music audio.
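For reference, the mapping quoted in the question's code comment, M(l) = round(12 * log2((f_s * l) / (N * f_ref))) % 12, can be sketched in vectorised form roughly as follows (Python 3). This is only a sketch under several assumptions that are mine, not the question's: stereo input is mixed down to mono, bin l = 0 is skipped because log2(0) is undefined, each bin contributes its squared magnitude, and the result is normalised to sum to 1. The function name `pitch_class_profile` is hypothetical.

```python
import numpy as np


def pitch_class_profile(samples, rate, f_ref=16.35):
    """Fold the power spectrum of `samples` into 12 pitch classes (0 = C)."""
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 2:
        # Assumption: average the channels so N is the number of frames
        samples = samples.mean(axis=1)
    n = samples.size
    power = np.abs(np.fft.rfft(samples)) ** 2

    # l is the bin INDEX (1, 2, 3, ...), not the FFT value; skip l = 0 (DC)
    bins = np.arange(1, power.size)
    freqs = rate * bins / n
    classes = np.round(12 * np.log2(freqs / f_ref)).astype(int) % 12

    # Sum the power of all bins that map to the same pitch class
    pcp = np.bincount(classes, weights=power[1:], minlength=12)
    total = pcp.sum()
    return pcp / total if total > 0 else pcp


# Usage: one second of a pure 440 Hz sine sampled at 8 kHz
rate = 8000
t = np.arange(rate) / rate
profile = pitch_class_profile(np.sin(2 * np.pi * 440.0 * t), rate)
print(int(np.argmax(profile)))  # index of the strongest pitch class
```

For a pure sine like this the strongest class is 9 (A), but with a real piano recording the harmonics described above are still folded into other classes, so even a correct implementation of the formula can report notes that were never played.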