Python: How to weight data by time for a statsmodels HuberT linear regression?

I'm using statsmodels, and this is the code I use to fit a multiple linear regression:

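import pandas as pd
import statsmodels.api as sm

def regression():
    # "CSV_file" is a placeholder for the path to the data file
    Data = pd.read_csv("CSV_file")
    DependentVariable = Data[["Variable1"]].values.tolist()
    IndependentVariables = Data[["Variable2","Variable3","Variable4"]].values.tolist()

    # Robust linear model with the Huber T norm; no intercept column is added here
    huber_t = sm.RLM(DependentVariable, IndependentVariables, M=sm.robust.norms.HuberT())

    hub_results = huber_t.fit()
    return hub_results.summary()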

This produces the expected output. However, I would also like to weight my data so that more recent observations count more than older ones. I was thinking of using some sort of exponential decay to compute the weights. Is there any way to take this weighting into account when fitting the regression?

There's an example of scaling with exponential decay on this page, but I'm not sure whether the same technique will work for you (perhaps it only works in the context of plotting, but you can give it a try for scaling your own variable): http://blog.yhat.com/posts/predicting-the-presidential-election.html

weight <- function(i) {
  exp(1)*1 / exp(i)
}

w <- data.frame(poll=1:8, weight=weight(1:8))
ggplot(w, aes(x=poll, weight=weight)) +
  geom_bar() +
  scale_x_continuous("nth poll", breaks=1:8) +
  scale_y_continuous("weight")

Or perhaps you can generate an exponentially decaying series with numpy, using the answer provided here:

Pandas: Exponentially decaying sum with variable weights
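
For illustration, here is a minimal numpy sketch of such a decaying series. The function name `exponential_decay_weights` and the `half_life` parameter are my own choices, not taken from the linked answer, and the sketch assumes the rows are ordered from oldest to newest:

```python
import numpy as np

def exponential_decay_weights(n_obs, half_life):
    """Return one weight per row, assuming rows are ordered oldest to newest.

    The newest row gets weight 1, and the weight halves for every
    `half_life` rows further back in time.
    """
    # Age of each row in steps: n_obs - 1 for the oldest, 0 for the newest
    age = np.arange(n_obs)[::-1]
    return np.exp(-np.log(2.0) * age / half_life)

# Five observations, with the weight halving every two rows
weights = exponential_decay_weights(5, half_life=2)
```

How fast the weights should decay is a modelling decision; a short half-life makes the fit depend almost entirely on the last few rows.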

Weights of this kind cannot currently be used in the statsmodels robust linear model (RLM).

See "statsmodels -- weights in robust linear regression" for a related answer.

Because HuberT is quadratic locally at small residuals, the rescaling of the data by the weights as in that answer can work as an approximation. However, it is not equivalent to adding weights to the contribution to the objective function by each observation.
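
A rough sketch of that rescaling approximation, assuming the weights are non-negative and supplied one per row. The helper names `rescale_by_weights` and `fit_rescaled_huber` are hypothetical, not part of statsmodels. Each row of the response and the regressors is multiplied by the square root of its weight, which reproduces weighted least squares exactly under a purely quadratic loss. With HuberT it only matches while the rescaled residual stays in the quadratic region, and the robust scale estimate is also computed from the rescaled residuals:

```python
import numpy as np

def rescale_by_weights(y, X, weights):
    """Multiply each row of y and X by sqrt(weight).

    If X contains a constant column, it is rescaled too, which is what
    the weighted least squares analogy requires.
    """
    root_w = np.sqrt(np.asarray(weights, dtype=float))
    y = np.asarray(y, dtype=float).ravel()
    X = np.asarray(X, dtype=float)
    return y * root_w, X * root_w[:, None]

def fit_rescaled_huber(y, X, weights):
    """Fit RLM with the HuberT norm on the rescaled data (approximation only)."""
    # Imported here so the helper above can be used without statsmodels
    import statsmodels.api as sm

    y_w, X_w = rescale_by_weights(y, X, weights)
    return sm.RLM(y_w, X_w, M=sm.robust.norms.HuberT()).fit()
```

Calling `fit_rescaled_huber(DependentVariable, IndependentVariables, weights).summary()` would then take the place of the unweighted fit. Observations with large residuals fall in the linear part of the Huber loss, where their contribution scales with the square root of the weight rather than the weight itself, so treat the result as an approximation.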
