
Output of a statsmodels regression

I would like to perform a simple linear regression using statsmodels. I've tried several different methods by now, but I just can't get it to work. The code I have now doesn't give me any errors, but it also doesn't show me the result.

I am trying to create a model for the variable "Direction", which takes the value 0 if the return for the corresponding date was negative and 1 if it was positive. The explanatory variables are the five lags of the returns. df13 contains the lags and also the direction for each observed date. I tried the code below and, as I mentioned, it doesn't give an error but only prints "Optimization terminated successfully. Current function value: 0.682314 Iterations 5".

However, I would like to see the typical table with all the beta values, their significance, etc.

Also, since Direction is a binary variable, would it be better to use a logit instead of a linear model? In the assignment it appeared as a linear model.

Lastly, I am sorry the code is not displayed correctly here; I don't know how to format it as code or how to insert my dataframe.

import os
import itertools

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
from sklearn import preprocessing
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from statsmodels.sandbox.regression.predstd import wls_prediction_std

...



X = df13[['Lag1', 'Lag2', 'Lag3', 'Lag4', 'Lag5']]
Y = df13['Direction']

X = sm.add_constant(X)


model = sm.Logit(Y.astype(float), X.astype(float)).fit()
predictions = model.predict(X)

print_model = model.summary
print(print_model)

Edit: I'm sure it has to be a logit regression so I updated that part

I don't know if this is unintentional, but it looks like you need to define X and Y on separate lines:

X = df13[['Lag1', 'Lag2', 'Lag3', 'Lag4', 'Lag5']]

Y = df13['Direction']

Secondly, I'm not familiar with statsmodels, but I would try converting your dataframes to numpy arrays. You can do this with

Xnum = X.to_numpy()

ynum = Y.to_numpy()

And try passing those to the regressors.
