As in the question "Calculating rolling correlation of pandas dataframes", I need to compute the correlation of an array of length N with each window of a second array of length M.
import numpy as np
x = np.random.randint(0, 100, 10000)
y = [4, 5, 4, 5]
corrs = []
# correlate y with each length-len(y) window of x
for i in range(len(x) - len(y) + 1):
    corrs.append(np.corrcoef(x[i:i + len(y)], y)[0, 1])
Every similar question I find discusses how to do this for matrices, NxK against MxK. However, the ones I tried do not work for 1-D data. In the linked question, the suggestion is to roll over the pandas frame, which is pretty slow. Is there a faster way to calculate this?
The loop above takes around 0.4 s, and the code from the linked question takes 1.6 s:
import pandas as pd
corr = pd.Series(x).rolling(len(y)).apply(lambda w: np.corrcoef(w, y)[0, 1], raw=False).dropna()
Is there a much more efficient way to do this?
Store your correlation coefficients in a preallocated NumPy array instead of a regular Python list (the list has to grow as you append elements to it):
# preallocate one slot per window
corrs = np.zeros(len(x) - len(y) + 1)
for i in range(len(corrs)):
    corrs[i] = np.corrcoef(x[i:i + len(y)], y)[0, 1]
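The loop above still calls np.corrcoef once per window, so most of the time likely goes into per-call overhead rather than into storing the results. As a possible alternative (not part of the original answer), here is a minimal sketch that computes the Pearson correlation for all windows at once. It assumes NumPy 1.20 or later for sliding_window_view, and the helper name rolling_corr is made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20


def rolling_corr(x, y):
    """Pearson correlation of y with every length-len(y) window of x.

    Hypothetical helper for illustration; not a NumPy or pandas function.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # View of shape (len(x) - len(y) + 1, len(y)); no data is copied here.
    windows = sliding_window_view(x, len(y))
    # Center each window and y around their own means.
    wc = windows - windows.mean(axis=1, keepdims=True)
    yc = y - y.mean()
    # Numerator: sum of products of deviations, one value per window.
    num = wc @ yc
    # Denominator: product of the two root sums of squared deviations.
    den = np.sqrt((wc ** 2).sum(axis=1) * (yc ** 2).sum())
    # A constant window gives 0/0, which becomes NaN (np.corrcoef does the same).
    with np.errstate(divide="ignore", invalid="ignore"):
        return num / den


rng = np.random.default_rng(0)
x = rng.integers(0, 100, 10_000)
y = [4, 5, 4, 5]
corrs = rolling_corr(x, y)  # one coefficient per window
```

The result should match the loop version up to floating-point rounding. The centered copy of the windows takes len(y) times the memory of x, which is fine for a short y like this one but may matter for long windows.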