Pandas correlation between a DataFrame and a Series

Question

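I have a DataFrame and a Series, and I would like to get their rolling correlation back as a new DataFrame.

df1 has 3 columns, and I want a new DataFrame that holds the rolling correlation of each of those columns with the Series object.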

import pandas as pd

# Load each CSV, parse the Date column, and use it as the index.
df1 = pd.read_csv('https://bpaste.net/raw/d0456d3a020b')
df1['Date'] = pd.to_datetime(df1['Date'])
df1 = df1.set_index('Date')

df2 = pd.read_csv('https://bpaste.net/raw/d5cb455cb091')
df2['Date'] = pd.to_datetime(df2['Date'])
df2 = df2.set_index('Date')


pd.rolling_corr(df1, df2)

The result (https://bpaste.net/show/58b59c656ce4) contains only NaN and 1s.

pd.rolling_corr(df1['IWM_Close'], spy, window=22)

gives the desired series back, but I would rather not loop over the columns of the DataFrame. Is there a better way?

Thanks.

Answer 1

I believe your second input has to be a Series in order to be correlated with all the columns of the first DataFrame.

This works:

import numpy as np
import pandas as pd
from datetime import date

index = pd.date_range(start=date(2015, 1, 1), freq='W', periods=100)
df1 = pd.DataFrame(np.random.random((100, 3)), index=index)
df2 = pd.DataFrame(np.random.random((100, 1)), index=index)
# squeeze() turns the one-column DataFrame into a Series
print(pd.rolling_corr(df1, df2.squeeze(), window=20).tail())

Or, for the same result:

df2 = pd.Series(np.random.random(100), index=index)
print(pd.rolling_corr(df1, df2, window=20).tail())

                   0         1         2
2016-10-30 -0.170971 -0.039929 -0.091098
2016-11-06 -0.199441  0.000093 -0.096331
2016-11-13 -0.213728 -0.020709 -0.129935
2016-11-20 -0.075859  0.014667 -0.153830
2016-11-27 -0.114041  0.019886 -0.155472
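
If your pandas version no longer ships the top-level pd.rolling_corr function (it was removed in later releases), the same DataFrame-against-Series calculation can probably be written with the .rolling() accessor. This is only a minimal sketch, assuming a recent pandas; the fixed seed and the names s and result are illustrative and not part of the original answer, and the numbers will differ from the output shown above.

```python
import numpy as np
import pandas as pd

# Illustrative random data with the same shape as the example above.
rng = np.random.default_rng(0)
index = pd.date_range(start="2015-01-01", freq="W", periods=100)
df1 = pd.DataFrame(rng.random((100, 3)), index=index)
s = pd.Series(rng.random(100), index=index)

# Rolling 20-period correlation of every column of df1 with the Series s.
# The result keeps df1's columns; the first 19 rows are NaN (incomplete window).
result = df1.rolling(window=20).corr(s)
print(result.tail())
```

Because the second argument is a Series, each column of df1 is correlated with it in turn, so no explicit loop over the columns is needed.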

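But this does not. Note the missing .squeeze(): with df2 still the one-column DataFrame from the first example, only the matching columns are correlated: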

print(pd.rolling_corr(df1, df2, window=20).tail())

                   0   1   2
2016-10-30  0.019865 NaN NaN
2016-11-06  0.087075 NaN NaN
2016-11-13  0.011679 NaN NaN
2016-11-20 -0.004155 NaN NaN
2016-11-27  0.111408 NaN NaN
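
To make the column matching easier to see, here is a small sketch with named columns, which is closer to the situation in the question. It assumes a recent pandas with the .rolling() accessor; the column names A, B, C and ref, the seed, and the variable names are all hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
index = pd.date_range(start="2015-01-01", freq="W", periods=100)

# Hypothetical three-column frame and a one-column reference frame.
prices = pd.DataFrame(rng.random((100, 3)), index=index, columns=["A", "B", "C"])
ref_frame = pd.DataFrame(rng.random((100, 1)), index=index, columns=["ref"])

# DataFrame vs DataFrame: columns are paired by name. No names match here,
# so every column of the result is NaN.
mismatched = prices.rolling(window=20).corr(ref_frame)

# DataFrame vs Series: squeeze the single column into a Series first, and
# every column of prices is correlated with it.
fixed = prices.rolling(window=20).corr(ref_frame.squeeze("columns"))

print(mismatched.tail())
print(fixed.tail())
```

Selecting the column directly, for example ref_frame["ref"], should work just as well as squeeze, since both give a Series.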

Answer 1
1 accepted 2015-12-22 15:08:45
