
The normalized cross-correlation of two signals in Python

I want to calculate the normalized cross-correlation function of two signals, where the x axis is the time delay and the y axis is the correlation value between -1 and 1, so I decided to use scipy.

I use the command corr = signal.correlate(s1['Strain'], s2['Strain'], mode='full')

where s1['Strain'] and s2['Strain'] are pandas DataFrame columns, but it doesn't return the normalized function with the time delay on the x axis. Here is some example data:

s1:

            Strain
0        -1.587702e-22
1        -1.425868e-22
2        -1.174897e-22
3        -8.559119e-23
4        -4.949480e-23
.             .
.             .
.             .

s2 looks similar. I know the sampling rate of both datasets: it's 4096 kHz.

Thanks for your help.

First of all, to get a normalized coefficient (such that at lag 0 we get the Pearson correlation, assuming both signals have zero mean; otherwise subtract the mean first):

  • divide both signals by their standard deviation
  • scale by the length of the signal over which the convolution is done (shortest signal)
out = correlate(x/np.std(x), y/np.std(y), 'full') / min(len(x), len(y))
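As a quick sanity check of the normalisation above, the following sketch compares the lag-0 value with np.corrcoef. It assumes two equal-length signals and subtracts the mean first, since the match with the Pearson coefficient only holds for zero-mean inputs. The helper name normalized_xcorr and the random test data are illustrative, not part of the original answer.

```python
import numpy as np
from scipy.signal import correlate

def normalized_xcorr(x, y):
    """Full cross-correlation, scaled so that equal-length inputs
    give the Pearson coefficient at lag 0 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Remove the mean so that lag 0 matches the Pearson definition.
    x = x - x.mean()
    y = y - y.mean()
    c = correlate(x / np.std(x), y / np.std(y), mode="full")
    return c / min(len(x), len(y))

# Synthetic, partially correlated signals (illustrative data only).
rng = np.random.default_rng(0)
a = rng.standard_normal(500)
b = 0.7 * a + 0.3 * rng.standard_normal(500)

out = normalized_xcorr(a, b)
zero_lag = out[len(b) - 1]            # lag 0 sits at index len(y) - 1 in 'full' mode
pearson = np.corrcoef(a, b)[0, 1]
print(zero_lag, pearson)              # the two values should agree
```

For signals of different lengths the lag-0 value is only an approximation of the Pearson coefficient, because the overlap covers just part of the longer signal.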

Now for the lags: the official documentation of correlate gives the full output of the cross-correlation as:

\[ z[k] = (x * y)(k - N + 1) = \sum_{l=0}^{\|x\|-1} x_l \, y_{l-k+N-1}^{*} \]

Here (x * y) denotes the correlation operation (computed like a convolution with y reversed), the superscript * is the complex conjugate, ||x|| is the length of x, and k goes from 0 up to ||x|| + ||y|| - 2. N is max(len(x), len(y)).

The lags are the argument of (x * y) above, so they range from 0 - N + 1 to ||x|| + ||y|| - 2 - N + 1, which is n - 1 with n = min(len(x), len(y)).

Also, from a brief look at the source code, I think x and y are sometimes swapped internally when convenient (hence the min(len(x), len(y)) in the normalisation above). However, this means the start of the lags depends on which signal is longer, therefore:

N = max(len(x), len(y))
n = min(len(x), len(y))

# In 'full' mode the lags run from -(len(y) - 1) to len(x) - 1
if len(x) < len(y):
    lags = np.arange(-N + 1, n)
else:
    lags = np.arange(-n + 1, N)
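The lag ranges above can be cross-checked against scipy.signal.correlation_lags, which is available in recent SciPy releases. The sketch below wraps the manual rule in a hypothetical helper, manual_lags, and compares it with the library function for both length orderings. The lengths 3 and 5 are arbitrary.

```python
import numpy as np
from scipy.signal import correlation_lags

def manual_lags(len_x, len_y):
    """Lags of correlate(x, y, 'full'), following the rule above."""
    N = max(len_x, len_y)
    n = min(len_x, len_y)
    if len_x < len_y:
        return np.arange(-N + 1, n)
    return np.arange(-n + 1, N)

# Compare with SciPy's own helper for x shorter and x longer than y.
for len_x, len_y in [(3, 5), (5, 3), (4, 4)]:
    expected = correlation_lags(len_x, len_y, mode="full")
    print(len_x, len_y, np.array_equal(manual_lags(len_x, len_y), expected))
```

If correlation_lags is available in your SciPy version, calling it directly is probably simpler than maintaining the if/else by hand.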

Summary

Try this code on the two time series whose cross-correlation you want to plot:

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import correlate

def plot_xcorr(x, y): 
    "Plot cross-correlation (full) between two signals."
    N = max(len(x), len(y)) 
    n = min(len(x), len(y)) 

    if N == len(y): 
        lags = np.arange(-N + 1, n) 
    else: 
        lags = np.arange(-n + 1, N) 
    c = correlate(x / np.std(x), y / np.std(y), 'full') 

    plt.plot(lags, c / n) 
    plt.show() 
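Since the question asks for the time delay on the x axis, one possible variant is to divide the lags by the sampling rate. The sketch below returns the delay axis and the normalized correlation instead of plotting them. The name xcorr_vs_delay, the value of fs, and the sine test signals are assumptions for illustration; set fs to the actual sampling rate of your data.

```python
import numpy as np
from scipy.signal import correlate

def xcorr_vs_delay(x, y, fs):
    """Return (delays in seconds, normalized full cross-correlation).

    fs is the sampling rate in samples per second (hypothetical helper).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = min(len(x), len(y))
    # 'full' lags run from -(len(y) - 1) to len(x) - 1, in samples.
    lags = np.arange(-len(y) + 1, len(x))
    c = correlate(x / np.std(x), y / np.std(y), mode="full") / n
    return lags / fs, c

fs = 4096.0                      # placeholder sampling rate, adjust to your data
t = np.arange(1024) / fs
x = np.sin(2 * np.pi * 50 * t)
y = np.sin(2 * np.pi * 50 * t + 0.5)

delays, c = xcorr_vs_delay(x, y, fs)
```

The result can then be plotted with plt.plot(delays, c), which puts seconds rather than sample counts on the x axis.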

To calculate the time delay between two signals, we need to find the cross-correlation between two signals and find the argmax.

Assuming data_1 and data_2 are samples of two signals:

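import numpy as np

# Cross-correlate the two sample arrays; 'same' keeps the output centred on lag 0
correlation = np.correlate(data_1, data_2, mode='same')
# Offset of the peak from the centre index, in samples
delay = np.argmax(correlation) - len(correlation) // 2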
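To see the argmax approach at work, here is a small synthetic check, assuming equal-length signals where data_1 is a copy of data_2 delayed by a known number of samples. The noise signal, the shift of 25 samples, and the sampling rate fs are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
true_shift = 25                                   # known delay in samples
base = rng.standard_normal(1000)

data_2 = base
# data_1 is data_2 delayed by true_shift samples (zero-padded at the start).
data_1 = np.concatenate([np.zeros(true_shift), base[:-true_shift]])

correlation = np.correlate(data_1, data_2, mode="same")
delay = np.argmax(correlation) - len(correlation) // 2

fs = 4096.0                                       # placeholder sampling rate
delay_seconds = delay / fs
print(delay, delay_seconds)
```

With this argument order, a positive delay means data_1 lags behind data_2. Swapping the arguments flips the sign.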
