How to do regression in Python via xarray?
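I am trying to run a day-by-day regression on my time series data X and Y, regressing the current date's Y values on the previous date's X data. X is a 3-D data array with dimensions date, stock, and factor, and Y is a 2-D data array with dimensions date and stock. Can someone tell me how to do this efficiently?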
# -*- coding: utf-8 -*-
import logging

import numpy as np
import pandas as pd
import statsmodels.api as sm
import xarray as xr

# logger used for the per-date debug messages in the loop
logger = logging.getLogger(__name__)
time = pd.date_range('2000-01-01', freq='D', periods=365)
# X: (date, stock, factor); Y: (date, stock)
X = xr.DataArray(
    np.random.randn(365, 10, 3),
    [('date', time), ('stock', list('abcdefghij')), ('factor', list('xyz'))])
Y = xr.DataArray(
    np.random.randn(365, 10),
    [('date', time), ('stock', list('abcdefghij'))])
# create the regression result DataFrames
params = pd.DataFrame(
    index=X.get_index('date'), columns=X.get_index('factor'), dtype=float)
residuals = pd.DataFrame(
    index=X.get_index('date'), columns=X.get_index('stock'), dtype=float)
# get the DatetimeIndex
idx_date = Y.get_index('date')
for dt in Y.date.values:
    logger.debug('regression on %s', dt)
    cur_date = pd.Timestamp(dt)
    # get the position of the current date; the first date has no previous day
    dt_pos = idx_date.get_loc(cur_date)
    if dt_pos == 0:
        continue
    dt_pre_pos = dt_pos - 1
    # previous date's X as (stock, factor) and current date's Y as (stock,)
    x_prev = X.isel(date=dt_pre_pos).transpose('stock', 'factor')
    y_cur = Y.sel(date=cur_date)
    # stocks having valid (non-NaN) values in every factor and in Y
    valid_x = x_prev.stock.values[x_prev.notnull().all(dim='factor').values]
    valid_y = y_cur.stock.values[y_cur.notnull().values]
    valid_stock = np.intersect1d(valid_x, valid_y)
    try:
        # robust regression of Y (endog) on the factors (exog), no intercept
        model = sm.RLM(
            y_cur.sel(stock=valid_stock).values,
            x_prev.sel(stock=valid_stock).values,
            M=sm.robust.norms.HuberT())
        results = model.fit()
    except ValueError:
        continue
    params.loc[cur_date] = results.params
    residuals.loc[cur_date, valid_stock] = results.resid
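The loop above looks up the previous date by position on every iteration. As a rough sketch only, the same lag alignment can be done once with xarray's `shift`, followed by a per-date fit on plain NumPy arrays. Note the assumptions: the helper name `lagged_ols` is made up for illustration, the dimension names `date`, `stock`, and `factor` follow the sample data rather than the `symbol` name used in the loop, and ordinary least squares via `np.linalg.lstsq` is swapped in for `sm.RLM` to keep the example dependency-light, so the coefficients will not match a Huber-weighted fit.

```python
import numpy as np
import xarray as xr


def lagged_ols(X, Y):
    """Regress Y[t] on X[t-1] per date (hypothetical helper; plain OLS, no intercept)."""
    # Shift X forward one step so that X_lag[t] holds X[t-1];
    # the first date is filled with NaN and is skipped below.
    X_lag = X.shift(date=1).transpose('date', 'stock', 'factor')
    Y = Y.transpose('date', 'stock')
    # A stock is usable on a date only if Y and every lagged factor are present.
    valid = (X_lag.notnull().all(dim='factor') & Y.notnull()).values
    x, y = X_lag.values, Y.values
    n_date, n_stock, n_factor = x.shape
    params = np.full((n_date, n_factor), np.nan)
    resid = np.full((n_date, n_stock), np.nan)
    for t in range(n_date):
        m = valid[t]
        if m.sum() < n_factor:  # too few observations to fit this date
            continue
        beta, *_ = np.linalg.lstsq(x[t, m], y[t, m], rcond=None)
        params[t] = beta
        resid[t, m] = y[t, m] - x[t, m] @ beta
    dates = X.get_index('date')
    params = xr.DataArray(
        params, [('date', dates), ('factor', X.get_index('factor'))])
    resid = xr.DataArray(
        resid, [('date', dates), ('stock', X.get_index('stock'))])
    return params, resid
```

With the sample arrays defined earlier, `params, residuals = lagged_ols(X, Y)` returns two `DataArray` objects indexed by date; rows for dates that could not be fitted (including the first date) stay NaN. Whether this is actually faster than the original loop depends on the data size and on whether the robust fit is required.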