Make a correlation plot on time series data in Python
I want to see the correlation on a rolling weekly basis in time series data, because I want to see how the rolling correlation moves each year. To do so, I tried to use the built-in pandas.corr() and pandas.rolling_corr() functions to get the rolling correlation and then make a line plot, but I couldn't get a correct correlation line chart. I don't know how I should aggregate the time series to get a rolling correlation line chart. Does anyone know a way of doing this in Python? Is there any workaround to get a rolling correlation line chart from time series data in pandas?
My attempt:
I tried using pandas.corr() to get the correlation, but it did not help to generate a rolling correlation line chart. So here is my new attempt, but it is not working. I assume I should think about the right way to aggregate the data to make a rolling correlation line chart.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

url = 'https://gist.githubusercontent.com/adamFlyn/eb784c86c44fd7ed3f2504157a33dc23/raw/79b6aa4f2e0ffd1eb626dffdcb609eb2cb8dae48/corr.csv'
df = pd.read_csv(url)
df['date'] = pd.to_datetime(df['date'])

def get_corr(df, window=4):
    dfs = []
    for key, value in df:
        value["ROLL_CORR"] = pd.rolling_corr(value["prod_A_price"], value["prod_B_price"], window)
        dfs.append(value)
    df_final = pd.concat(dfs)
    return df_final

corr_df = get_corr(df, window=12)

fig, ax = plt.subplots(figsize=(7, 4), dpi=144)
sns.lineplot(x='week', y='ROLL_CORR', hue='year', data=corr_df, alpha=.8)
plt.show()
plt.close()
Doing it this way does not work for me. By doing this, I want to see how the rolling correlations move each year. Can anyone point out a possible way of making a rolling correlation line chart from time series data in Python?
Desired output
Here is the desired rolling correlation line chart that I want to get. Note that the desired plot was generated in MS Excel. I am wondering whether there is a way of doing this in Python. How should I correct my current attempt to get the desired output?
Using your code and description as a starting point: pandas' Rolling class has an apply function which can be leveraged (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.window.rolling.Rolling.apply.html#pandas.core.window.rolling.Rolling.apply).
Two tricks are involved to make the code work:
1. Call the rolling function on a pandas.Series (here df['week']) to avoid running the applied function once per column.
2. Inside the applied function, use the index of the window it receives (ser.index) to select the matching rows of df with df.loc, so that both price columns are available.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

url = 'https://gist.githubusercontent.com/adamFlyn/eb784c86c44fd7ed3f2504157a33dc23/raw/79b6aa4f2e0ffd1eb626dffdcb609eb2cb8dae48/corr.csv'
df = pd.read_csv(url)

def get_corr(ser):
    # ser is one rolling window of df['week']; its index identifies the rows in that window
    rolling_df = df.loc[ser.index]
    return rolling_df['prod_A_price'].corr(rolling_df['prod_B_price'])

# raw=False (the default) passes each window as a Series, which keeps the index
df['ROLL_CORR'] = df['week'].rolling(4).apply(get_corr, raw=False)

number_years = 3
# DataFrame.append was removed in pandas 2.0, so collect the average rows and concatenate once
avg_rows = []
for week, df_week in df.groupby('week'):
    avg_rows.append({
        'week': week,
        'year': f'{number_years} year avg',
        'ROLL_CORR': df_week.sort_values(by='date').head(number_years)['ROLL_CORR'].mean()
    })
df = pd.concat([df, pd.DataFrame(avg_rows)], ignore_index=True)

fig, ax = plt.subplots(figsize=(7, 4), dpi=144)
sns.lineplot(x='week', y='ROLL_CORR', hue='year', data=df, alpha=.8)
plt.show()
plt.close()
Here is the image generated by seaborn.
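As a possible alternative to Rolling.apply, pandas also provides a built-in pairwise rolling correlation, Series.rolling(window).corr(other), which replaces the removed pd.rolling_corr from the question. The sketch below is only an illustration under assumptions: it uses made-up weekly data instead of the CSV, the frame name demo and the column ROLL_CORR_BY_YEAR are hypothetical, and only the prod_A_price, prod_B_price, year and ROLL_CORR names come from the question. It also shows one way to restart the window within each year, in case windows that span two years are not wanted.

```python
import numpy as np
import pandas as pd

# Synthetic weekly data standing in for the CSV (values are made up).
rng = np.random.default_rng(0)
dates = pd.date_range("2018-01-01", periods=156, freq="W-MON")
demo = pd.DataFrame({
    "date": dates,
    "prod_A_price": rng.normal(100, 5, len(dates)),
    "prod_B_price": rng.normal(50, 3, len(dates)),
})
demo["year"] = demo["date"].dt.year

window = 4

# Built-in rolling correlation over the whole series.
# The first window - 1 rows are NaN because the window is not yet full.
demo["ROLL_CORR"] = demo["prod_A_price"].rolling(window).corr(demo["prod_B_price"])

def corr_within_year(group):
    """Rolling correlation computed only from rows of one year."""
    return group["prod_A_price"].rolling(window).corr(group["prod_B_price"])

# Restart the window in each year; concat keeps the original row index,
# so the result aligns with demo when assigned back.
parts = [corr_within_year(group) for _, group in demo.groupby("year")]
demo["ROLL_CORR_BY_YEAR"] = pd.concat(parts)
```

With the real data, which already has week and year columns, either result column could be passed to sns.lineplot as y in the same way as above. Which variant is appropriate depends on whether the first weeks of a year should borrow observations from the end of the previous year.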