
What am I doing wrong when using seasonal decompose in Python?

I have a small time series with monthly intervals. I wanted to plot it and then decompose it into seasonality, trend, and residuals. I start by importing the CSV into pandas and then plotting just the time series, which works fine. I follow this tutorial, and my code goes like this:

%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

ali3 = pd.read_csv('C:\\Users\\ALI\\Desktop\\CSV\\index\\ZIAM\\ME\\ME_DATA_7_MONTH_AVG_PROFIT\\data.csv',
 names=['Date', 'Month','AverageProfit'],
 index_col=['Date'],
 parse_dates=True)

# Delete the Month column, which is a string
del ali3['Month']


ali3
plt.plot(ali3)

[image: Data Frame]

At this stage I try to do the seasonal decompose like this:

import statsmodels.api as sm 
res = sm.tsa.seasonal_decompose(ali3.AverageProfit)  
fig = res.plot() 

which results in the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-41-afeab639d13b> in <module>()
      1 import statsmodels.api as sm
----> 2 res = sm.tsa.seasonal_decompose(ali3.AverageProfit)
      3 fig = res.plot()

C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\statsmodels\tsa\seasonal.py in seasonal_decompose(x, model, filt, freq)
     86             filt = np.repeat(1./freq, freq)
     87 
---> 88     trend = convolution_filter(x, filt)
     89 
     90     # nan pad for conformability - convolve doesn't do it

C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\statsmodels\tsa\filters\filtertools.py in convolution_filter(x, filt, nsides)
    287 
    288     if filt.ndim == 1 or min(filt.shape) == 1:
--> 289         result = signal.convolve(x, filt, mode='valid')
    290     elif filt.ndim == 2:
    291         nlags = filt.shape[0]

C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\signal\signaltools.py in convolve(in1, in2, mode)
    468         return correlate(volume, kernel[slice_obj].conj(), mode)
    469     else:
--> 470         return correlate(volume, kernel[slice_obj], mode)
    471 
    472 

C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\signal\signaltools.py in correlate(in1, in2, mode)
    158 
    159     if mode == 'valid':
--> 160         _check_valid_mode_shapes(in1.shape, in2.shape)
    161         # numpy is significantly faster for 1d
    162         if in1.ndim == 1 and in2.ndim == 1:

C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\signal\signaltools.py in _check_valid_mode_shapes(shape1, shape2)
     70         if not d1 >= d2:
     71             raise ValueError(
---> 72                 "in1 should have at least as many items as in2 in "
     73                 "every dimension for 'valid' mode.")
     74 

ValueError: in1 should have at least as many items as in2 in every dimension for 'valid' mode.

Can anyone shed some light on what I'm doing wrong and how I can fix it? Much obliged.

Edit: This is what the data frame looks like:

Date            AverageProfit

2015-06-01          29.990231
2015-07-01          26.080038
2015-08-01          25.640862
2015-09-01          25.346447
2015-10-01          27.386001
2015-11-01          26.357709
2015-12-01          25.260644

You have 7 data points, which is usually far too few for stationarity analysis.

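You don't have enough points to use seasonal decomposition. With a monthly index the seasonal period is inferred as 12, so the moving-average filter used to estimate the trend is longer than your 7 observations; that is what the ValueError about 'valid' mode is reporting. To see what a decomposition needs, you can concatenate your data to create an extended time series (just repeating your data for the following months). Let extendedData be this extended dataframe and data your original data.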
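The answer does not show how extendedData was built, so the following is only one possible sketch. The helper name extend_by_repetition and the choice of four repetitions are assumptions made for illustration. The values are the seven from the question.

```python
import pandas as pd

# The seven monthly values shown in the question's data frame.
data = pd.Series(
    [29.990231, 26.080038, 25.640862, 25.346447,
     27.386001, 26.357709, 25.260644],
    index=pd.date_range("2015-06-01", periods=7, freq="MS"),
    name="AverageProfit",
)


def extend_by_repetition(series, n_repeats):
    """Hypothetical helper: tile the observed values over later months."""
    values = list(series.values) * n_repeats
    # Continue the month-start index from the first observed date.
    index = pd.date_range(series.index[0], periods=len(values), freq="MS")
    return pd.Series(values, index=index, name=series.name)


# Four repetitions give 28 monthly points (an arbitrary choice).
extendedData = extend_by_repetition(data, 4)
print(len(data), len(extendedData))
```

Note that tiling the same block invents a repeating pattern that is not in the real data, so the extended series is only good for seeing how much data the decomposition needs, not for drawing conclusions about the profits.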

data.plot()

[image: plot of data]

extendedData.plot()

[image: plot of extendedData]

res = sm.tsa.seasonal_decompose(extendedData.interpolate())
res.plot()

[image: seasonal decomposition plot of extendedData]

The frequency (freq) for the seasonal estimate is automatically inferred from the data, and it can also be specified manually.


You can try taking a first difference: generate a new time series by subtracting the previous value from each data value. In your case it looks like this:

[image: plot of the first-differenced series]
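A minimal sketch of the first difference using pandas, assuming the series is rebuilt from the seven values in the question (the variable name first_diff is illustrative):

```python
import pandas as pd

# The seven monthly values shown in the question's data frame.
data = pd.Series(
    [29.990231, 26.080038, 25.640862, 25.346447,
     27.386001, 26.357709, 25.260644],
    index=pd.date_range("2015-06-01", periods=7, freq="MS"),
    name="AverageProfit",
)

# diff() computes value[t] - value[t-1]; the first entry has no
# predecessor and becomes NaN, so it is dropped.
first_diff = data.diff().dropna()
print(first_diff)
```

Differencing leaves six points, one fewer than the original series.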

A stationarity test can be applied next, as explained here.
