Python Code for Pitch Class Profiling

I'm working on an implementation of the Pitch Class Profile (PCP) proposed by Takuya Fujishima. I've done my best to implement the equation (using scipy and numpy); however, I've been getting some rather odd results. I debated posting this on DSP, but figured it is more of a coding problem than a problem with understanding the equation.

In any case, here is my code.

import scipy.io.wavfile
import numpy as np
import math
import sys

class PCP:

    def __init__(self):
        self.note_references = [16.35, 17.32, 18.35, 19.45, 20.60, 21.83, 23.12, 24.50, 25.96, 27.50, 29.14, 30.87]
        self.results = {}


    def create_fft(self, filename):
        self.rate, self.data = scipy.io.wavfile.read('fmin.wav')
        print "Data from the File: \n", self.data

        self.frames = self.data.size
        print "Number of Frames: ", self.frames

        print "Rate: ", self.rate

        self.fft_results = np.fft.rfft(self.data) ##fft computing and normalization
        print "Results from the FFT: \n", self.fft_results


    # The following functions are based almost entirely on a thread
    # on DSP Stack Exchange.  Here is the link to that thread:
    # http://dsp.stackexchange.com/questions/13722/pitch-class-profiling
    # This function maps a spectrum value to a pitch class (0-11)
    def m_func(self, l, p):
        #M(l) = round(12 * log_2( (f_s*l)/(N*f_ref) ) ) % 12
        #print "L: ", l
        #print "Note: ", p
        a = self.rate * l
        b = self.frames * self.note_references[p]
        c = 12 * np.log2(a/b)
        d = np.round(c)
        e = np.mod(d.all(), 12)
        #print "Result: ", e
        #raw_input()
        return e


    def pcp(self, p):
        r = 0
        for l in self.fft_results:
            result = self.m_func(l[0], p)
            #print "actual returned result", result
            if result == p:
                r+=1
                #print "There was a match!  Add it!"
        return r


    def calculate_PCP(self):
        for p in range(0,11): #for all 12 notes
            self.results[p] = self.pcp(p)


    def print_results(self):
        for i in self.results.keys():
            print i , ":" , self.results[i]


def main():
    m = PCP()
    m.create_fft("fmin.wav")
    m.calculate_PCP()
    m.print_results()


if __name__ == '__main__':
    main()

Here are the outputs:

Data from the File: 
[[16 15]
 [ 9  9]
 [15 15]
 ..., 
 [ 0  0]
 [ 0  0]
 [ 0  0]]
Number of Frames:  352800
Rate:  44100
Results from the FFT: 
[[ 31.+0.j   1.+0.j]
 [ 18.+0.j   0.+0.j]
 [ 30.+0.j   0.+0.j]
 ..., 
 [  0.+0.j   0.+0.j]
 [  0.+0.j   0.+0.j]
 [  0.+0.j   0.+0.j]]
PCP.py:36: RuntimeWarning: divide by zero encountered in log2
  c = 12 * np.log2(a/b)
PCP.py:36: RuntimeWarning: invalid value encountered in cdouble_scalars
  c = 12 * np.log2(a/b)
0 : 143
1 : 176263
2 : 0
3 : 0
4 : 0
5 : 0
6 : 0
7 : 0
8 : 0
9 : 0
10 : 0

The file contains a piano playing an F-minor chord (F, A-flat, and C, corresponding to 5, 8, and 0 in the results dictionary). The results, however, indicate a very strong presence of C#/Db, and I can certainly verify there is no C# in the recording. I'd appreciate any and all help!
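For what it's worth, two details in the output above can be reproduced in isolation. The sketch below is an illustration rather than part of the original post: the `stereo` array is a hypothetical stand-in built from the first three sample rows printed above. It suggests that `np.fft.rfft` was applied across the two channels of each frame instead of along time, and that `d.all()` reduces every rounded value to a boolean, which would explain why only classes 0 and 1 ever receive counts.

```python
import numpy as np

# Hypothetical stand-in for the stereo samples printed above:
# one row per frame, columns are (left, right).
stereo = np.array([[16, 15], [9, 9], [15, 15]], dtype=np.int16)

# np.fft.rfft transforms the LAST axis by default, so on this array it runs
# over the two channels of each frame, not over time.
# Each row becomes [left + right, left - right].
per_frame = np.fft.rfft(stereo)
print(per_frame)

# .all() on a scalar only reports whether the value is nonzero, so the
# rounded semitone number is reduced to True/False before the modulo.
d = np.round(np.complex128(37.0))
collapsed = np.mod(d.all(), 12)
print(collapsed)
```

The first printed array has the same `[31, 1]`, `[18, 0]`, `[30, 0]` pattern as the "Results from the FFT" output in the question.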

Pitch is not the same thing as spectral frequency, so a note's pitch class cannot be read directly off the FFT magnitude bins that map to that class (especially for recordings of real musical sounds). If nothing else, any strong harmonics that are not power-of-two multiples of the fundamental (the 3rd, 5th, and so on) will end up in the wrong pitch class bin.
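To make the harmonic point concrete, here is a small sketch that computes which pitch class each harmonic falls into relative to its fundamental, using the same rounding rule as the formula in the question. The helper name `pitch_class_offset` is made up for this illustration.

```python
import math

def pitch_class_offset(harmonic):
    """Semitone offset (mod 12) of a harmonic relative to its fundamental."""
    return round(12 * math.log2(harmonic)) % 12

# Harmonics 1 through 8 of any note.
offsets = {k: pitch_class_offset(k) for k in range(1, 9)}
print(offsets)
# {1: 0, 2: 0, 3: 7, 4: 0, 5: 4, 6: 7, 7: 10, 8: 0}
```

Only harmonics 1, 2, 4, and 8 stay in the fundamental's class. For the F in the chord, the 3rd harmonic lands on C (7 semitones up) and the 5th on A (4 semitones up), which is not even a member of F minor.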

The algorithm referenced only works for a restricted class of waveforms, which are likely not representative of live music audio.
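Separately from that limitation, the mapping in the question's comment, M(l) = round(12 * log2((f_s * l) / (N * f_ref))) % 12, can be sketched as below. This is a minimal sketch under several assumptions, not the asker's code or a reference implementation: the function name `pitch_class_profile` is hypothetical, stereo input is mixed down to mono by averaging, a single `f_ref` of 16.35 Hz (the first entry of `note_references`, C0) is used for every class, `l` is the FFT bin index rather than the FFT value, and each class accumulates squared magnitudes rather than a count of matches. It does nothing about the harmonic problem described above.

```python
import numpy as np

def pitch_class_profile(samples, rate, f_ref=16.35):
    """Sum spectral power per pitch class (0 = C when f_ref is 16.35 Hz)."""
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 2:
        samples = samples.mean(axis=1)      # assumption: mix stereo to mono
    n = samples.size
    power = np.abs(np.fft.rfft(samples)) ** 2
    bins = np.arange(1, power.size)         # skip bin 0: log2(0) is undefined
    freqs = bins * rate / n                 # f_s * l / N for each bin index l
    classes = np.round(12 * np.log2(freqs / f_ref)).astype(int) % 12
    # Accumulate the power of every bin into its pitch class.
    return np.bincount(classes, weights=power[1:], minlength=12)

# Synthetic check with pure sine waves at F4, A-flat 4, and C5.
rate = 44100
t = np.arange(rate) / rate
chord = sum(np.sin(2 * np.pi * f * t) for f in (349.23, 415.30, 523.25))
profile = pitch_class_profile(chord, rate)
top3 = sorted(int(i) for i in np.argsort(profile)[-3:])
print(top3)
# [0, 5, 8]
```

On pure sine waves the three strongest classes are C, F, and A-flat. A real piano recording adds harmonics on top of this, so the profile will be less clean.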
