Calculate FFT from SampleGrabber using DirectShow .NET

Question

I'm developing a project using DirectShow .NET. I'm trying to integrate a library called "WPF Sound Visualization Library" which creates a spectrum analyzer visual.

In order for the visual to work I need to implement these 2 methods in my player:

GetFFTData(float[] fftDataBuffer) - Assigns current FFT data to a buffer.
Remarks: The FFT data in the buffer should consist only of the real number intensity values. This means that if your FFT algorithm returns complex numbers (as many do), you'd run an algorithm similar to: for(int i = 0; i < complexNumbers.Length / 2; i++) fftResult[i] = Math.Sqrt(complexNumbers[i].Real * complexNumbers[i].Real + complexNumbers[i].Imaginary * complexNumbers[i].Imaginary);
GetFFTFrequencyIndex(int frequency) - Gets the index in the FFT data buffer for a given frequency.

Edit: I already added the SampleGrabber and integrated it's callback with the GetFFTData (which is still not tested). But how do integrate the GetFFTFrequencyIndex method?

    protected int SampleCB(double SampleTime, IMediaSample pSample)
    {
        IntPtr pointer = IntPtr.Zero;
        pSample.GetPointer(out pointer);
        sampleDataBytes = new byte[pSample.GetSize()];
        Marshal.Copy(pointer, sampleDataBytes, 0, sampleDataBytes.Length);
        var sampleTime = SampleTime;
        var actualDataLength = pSample.GetActualDataLength();

        /* Release unmanaged resources */
        Marshal.ReleaseComObject(pSample);
        pSample = null;

        return (int)HResults.S_OK;
    }

    #region ISpectrumPlayer
    byte[] sampleDataBytes = null;

    public bool GetFFTData(float[] fftDataBuffer)
    {
        if (sampleDataBytes != null)
        {
            var sampleData = Utils.GetInt16Array(sampleDataBytes);
            double[] pRealIn = new double[sampleData.Length];

            for (var i = 0; i <= sampleData.Length - 1; i++)
                pRealIn[i] = sampleData[i];

            var pImagIn = new double[sampleDataBytes.Length];
            var pRealOut = new double[sampleDataBytes.Length];
            var pImagOut = new double[sampleDataBytes.Length];

            FFTUtils.Compute((uint) pRealIn.Length, pRealIn, pImagIn, pRealOut, pImagOut, false);

            fftDataBuffer = new float[sampleDataBytes.Length];

            for (int i = 0; i < pRealOut.Length; i++)
                fftDataBuffer[i] = (float) Math.Sqrt(pRealOut[i] * pRealOut[i] + pImagOut[i] * pImagOut[i]);
        }

        return true;
    }

    public int GetFFTFrequencyIndex(int frequency)
    {
        throw new NotImplementedException();
    }
    #endregion

I ודקג this class with methods that can help:

public class FFTUtils
{
    public const Double DDC_PI = 3.14159265358979323846;

    /// <summary>
    /// Verifies a number is a power of two
    /// </summary>
    /// <param name="x">Number to check</param>
    /// <returns>true if number is a power two (i.e.:1,2,4,8,16,...)</returns>
    public static Boolean IsPowerOfTwo(UInt32 x)
    {
        return ((x != 0) && (x & (x - 1)) == 0);
    }

    /// <summary>
    /// Get Next power of number.
    /// </summary>
    /// <param name="x">Number to check</param>
    /// <returns>A power of two number</returns>
    public static UInt32 NextPowerOfTwo(UInt32 x)
    {
        x = x - 1;
        x = x | (x >> 1);
        x = x | (x >> 2);
        x = x | (x >> 4);
        x = x | (x >> 8);
        x = x | (x >> 16);
        return x + 1;
    }

    /// <summary>
    /// Get Number of bits needed for a power of two
    /// </summary>
    /// <param name="PowerOfTwo">Power of two number</param>
    /// <returns>Number of bits</returns>
    public static UInt32 NumberOfBitsNeeded(UInt32 PowerOfTwo)
    {
        if (PowerOfTwo > 0)
        {
            for (UInt32 i = 0, mask = 1; ; i++, mask <<= 1)
            {
                if ((PowerOfTwo & mask) != 0)
                    return i;
            }
        }
        return 0; // error
    }

    /// <summary>
    /// Reverse bits
    /// </summary>
    /// <param name="index">Bits</param>
    /// <param name="NumBits">Number of bits to reverse</param>
    /// <returns>Reverse Bits</returns>
    public static UInt32 ReverseBits(UInt32 index, UInt32 NumBits)
    {
        UInt32 i, rev;

        for (i = rev = 0; i < NumBits; i++)
        {
            rev = (rev << 1) | (index & 1);
            index >>= 1;
        }

        return rev;
    }

    /// <summary>
    /// Return index to frequency based on number of samples
    /// </summary>
    /// <param name="Index">sample index</param>
    /// <param name="NumSamples">number of samples</param>
    /// <returns>Frequency index range</returns>
    public static Double IndexToFrequency(UInt32 Index, UInt32 NumSamples)
    {
        if (Index >= NumSamples)
            return 0.0;
        else if (Index <= NumSamples / 2)
            return (double)Index / (double)NumSamples;

        return -(double)(NumSamples - Index) / (double)NumSamples;
    }

    /// <summary>
    /// Compute FFT
    /// </summary>
    /// <param name="NumSamples">NumSamples Number of samples (must be power two)</param>
    /// <param name="pRealIn">Real samples</param>
    /// <param name="pImagIn">Imaginary (optional, may be null)</param>
    /// <param name="pRealOut">Real coefficient output</param>
    /// <param name="pImagOut">Imaginary coefficient output</param>
    /// <param name="bInverseTransform">bInverseTransform when true, compute Inverse FFT</param>
    public static void Compute(UInt32 NumSamples, Double[] pRealIn, Double[] pImagIn,
                        Double[] pRealOut, Double[] pImagOut, Boolean bInverseTransform)
    {
        UInt32 NumBits;    /* Number of bits needed to store indices */
        UInt32 i, j, k, n;
        UInt32 BlockSize, BlockEnd;

        double angle_numerator = 2.0 * DDC_PI;
        double tr, ti;     /* temp real, temp imaginary */

        if (pRealIn == null || pRealOut == null || pImagOut == null)
        {
            // error
            throw new ArgumentNullException("Null argument");
        }
        if (!IsPowerOfTwo(NumSamples))
        {
            // error
            throw new ArgumentException("Number of samples must be power of 2");
        }
        if (pRealIn.Length < NumSamples || (pImagIn != null && pImagIn.Length < NumSamples) ||
             pRealOut.Length < NumSamples || pImagOut.Length < NumSamples)
        {
            // error
            throw new ArgumentException("Invalid Array argument detected");
        }

        if (bInverseTransform)
            angle_numerator = -angle_numerator;

        NumBits = NumberOfBitsNeeded(NumSamples);

        /*
        **   Do simultaneous data copy and bit-reversal ordering into outputs...
        */
        for (i = 0; i < NumSamples; i++)
        {
            j = ReverseBits(i, NumBits);
            pRealOut[j] = pRealIn[i];
            pImagOut[j] = (double)((pImagIn == null) ? 0.0 : pImagIn[i]);
        }

        /*
        **   Do the FFT itself...
        */
        BlockEnd = 1;
        for (BlockSize = 2; BlockSize <= NumSamples; BlockSize <<= 1)
        {
            double delta_angle = angle_numerator / (double)BlockSize;
            double sm2 = Math.Sin(-2 * delta_angle);
            double sm1 = Math.Sin(-delta_angle);
            double cm2 = Math.Cos(-2 * delta_angle);
            double cm1 = Math.Cos(-delta_angle);
            double w = 2 * cm1;
            double ar0, ar1, ar2;
            double ai0, ai1, ai2;

            for (i = 0; i < NumSamples; i += BlockSize)
            {
                ar2 = cm2;
                ar1 = cm1;

                ai2 = sm2;
                ai1 = sm1;

                for (j = i, n = 0; n < BlockEnd; j++, n++)
                {
                    ar0 = w * ar1 - ar2;
                    ar2 = ar1;
                    ar1 = ar0;

                    ai0 = w * ai1 - ai2;
                    ai2 = ai1;
                    ai1 = ai0;

                    k = j + BlockEnd;
                    tr = ar0 * pRealOut[k] - ai0 * pImagOut[k];
                    ti = ar0 * pImagOut[k] + ai0 * pRealOut[k];

                    pRealOut[k] = (pRealOut[j] - tr);
                    pImagOut[k] = (pImagOut[j] - ti);

                    pRealOut[j] += (tr);
                    pImagOut[j] += (ti);
                }
            }

            BlockEnd = BlockSize;
        }

        /*
        **   Need to normalize if inverse transform...
        */
        if (bInverseTransform)
        {
            double denom = (double)(NumSamples);

            for (i = 0; i < NumSamples; i++)
            {
                pRealOut[i] /= denom;
                pImagOut[i] /= denom;
            }
        }
    }

    /// <summary>
    /// Calculate normal (power spectrum)
    /// </summary>
    /// <param name="NumSamples">Number of sample</param>
    /// <param name="pReal">Real coefficient buffer</param>
    /// <param name="pImag">Imaginary coefficient buffer</param>
    /// <param name="pAmpl">Working buffer to hold amplitude Xps(m) = | X(m)^2 | = Xreal(m)^2  + Ximag(m)^2</param>
    public static void Norm(UInt32 NumSamples, Double[] pReal, Double[] pImag, Double[] pAmpl)
    {
        if (pReal == null || pImag == null || pAmpl == null)
        {
            // error
            throw new ArgumentNullException("pReal,pImag,pAmpl");
        }
        if (pReal.Length < NumSamples || pImag.Length < NumSamples || pAmpl.Length < NumSamples)
        {
            // error
            throw new ArgumentException("Invalid Array argument detected");
        }

        // Calculate amplitude values in the buffer provided
        for (UInt32 i = 0; i < NumSamples; i++)
        {
            pAmpl[i] = pReal[i] * pReal[i] + pImag[i] * pImag[i];
        }
    }

    /// <summary>
    /// Find Peak frequency in Hz
    /// </summary>
    /// <param name="NumSamples">Number of samples</param>
    /// <param name="pAmpl">Current amplitude</param>
    /// <param name="samplingRate">Sampling rate in samples/second (Hz)</param>
    /// <param name="index">Frequency index</param>
    /// <returns>Peak frequency in Hz</returns>
    public static Double PeakFrequency(UInt32 NumSamples, Double[] pAmpl, Double samplingRate, ref UInt32 index)
    {
        UInt32 N = NumSamples >> 1;   // number of positive frequencies. (numSamples/2)

        if (pAmpl == null)
        {
            // error
            throw new ArgumentNullException("pAmpl");
        }
        if (pAmpl.Length < NumSamples)
        {
            // error
            throw new ArgumentException("Invalid Array argument detected");
        }

        double maxAmpl = -1.0;
        double peakFreq = -1.0;
        index = 0;

        for (UInt32 i = 0; i < N; i++)
        {
            if (pAmpl[i] > maxAmpl)
            {
                maxAmpl = (double)pAmpl[i];
                index = i;
                peakFreq = (double)(i);
            }
        }

        return samplingRate * peakFreq / (double)(NumSamples);
    }
}

Thank you so much!

Answer 1

If I remember well, the algorithm takes a real (eg int[n] ) signal or a complex one (eg int[n][2] ) and returns a complex FFT result.

So, it seems quite easy:

You take your input values (that you can plot as value-to-time in a chart, eg the left speaker audio values) and you feed them in the pRealIn parameter. In pImagIn you put zeros (as many as in pRealIn ). In bInverseTransform you put false (of course).

Then you will take the result back in pRealOut & pImagOut . The result buffers should logically be of the same size as the input buffers. You must take those two output buffers and combine them in pair like so (for each element of the OUT arrays):

fftDataBuffer[k] = Math.Sqrt(pRealOut[k] * pRealOut[k] + pImagOut[k] * pImagOut[k]); // Do this from 1 to n

FFT result is an array of complex values ( x = real part and y = imaginary part - you can depict it on a Cartesian system as a vector). You want the vector's size/amplitude that why you do the above.

That is for GetFFTData .

I see that you have a function called IndexToFrequency . So that may work. All you have to do is call this method for every index of your buffers. That is:

for(int i=0; i<n; i++) freq[i] = IndexToFrequency(i, n);

Keep those values stored and then in your GetFFTFrequencyIndex(int frequency) you find the closest match of the input parameter ( frequency ) to the elements in freq[n] and return its index.

I think that will be enough.

Important: Make sure that your buffers have a power-of-two size ( NextPowerOfTwo seems to be designed to help you do just that).

If your data happens to be smaller some times, you can pad the values with zeros at the end (aka append zeros to your input buffers). Also: To get a better resolution you can again pad with zeros your data. This will increase the 'smoothness' of your result which may be desirable.

Where did you find this code, if I may ask (just curious :) )?

So, that's all it! :)

Calculate FFT from SampleGrabber using DirectShow .NET

Question

1 answers

solution1
0 2014-08-02 17:28:17

Calculate FFT from SampleGrabber using DirectShow .NET

Question

1 answers

solution1 0 2014-08-02 17:28:17

solution1
0 2014-08-02 17:28:17