
Having trouble fitting data to HMM-Learn model (Python3.9)

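I am trying to fit a Hidden Markov Model to some stock data from the S&P500.

The data was downloaded from Yahoo Finance and is contained in a CSV file with 250 trading days of data. I had this code working a week ago, but now it does not seem to work.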

import pandas as pd
from hmmlearn import hmm
import numpy as np
from matplotlib import cm, pyplot as plt
from matplotlib.dates import YearLocator, MonthLocator

df = pd.read_csv( "SnP500_1Yhist.csv",
                   header      = 0,
                   index_col   = "Date",
                   parse_dates = True
                   )
df["Returns"] = df["Adj Close"].pct_change()
df.dropna( inplace = True )

hmm_model = hmm.GaussianHMM( n_components    =   4,
                             covariance_type =   "full",
                             n_iter          = 100
                             )               # %Create the model
df = df["Returns"]                           # %Extract the wanted column of data
training_set = np.column_stack( df )         # %Shape = [1,250]

hmm_model.fit( training_set )                # %This is where I get the error

The error I get is:

ValueError                                Traceback (most recent call last)
<ipython-input-51-c8f66806fad6> in <module>
      9 print(training_set.shape)
     10 print(training_set)
---> 11 hmm_model.fit(training_set)

~/Git Projects/Aiguille Systems/allocationmodel/macromodelv2_venv/lib/python3.9/site-packages/hmmlearn/base.py in fit(self, X, lengths)
    460         """
    461         X = check_array(X)
--> 462         self._init(X, lengths=lengths)
    463         self._check()
    464 

~/Git Projects/Aiguille Systems/allocationmodel/macromodelv2_venv/lib/python3.9/site-packages/hmmlearn/hmm.py in _init(self, X, lengths)
    205             kmeans = cluster.KMeans(n_clusters=self.n_components,
    206                                     random_state=self.random_state)
--> 207             kmeans.fit(X)
    208             self.means_ = kmeans.cluster_centers_
    209         if self._needs_init("c", "covars_"):

~/Git Projects/Aiguille Systems/allocationmodel/macromodelv2_venv/lib/python3.9/site-packages/sklearn/cluster/_kmeans.py in fit(self, X, y, sample_weight)
   1033                                 accept_large_sparse=False)
   1034 
-> 1035         self._check_params(X)
   1036         random_state = check_random_state(self.random_state)
   1037 

~/Git Projects/Aiguille Systems/allocationmodel/macromodelv2_venv/lib/python3.9/site-packages/sklearn/cluster/_kmeans.py in _check_params(self, X)
    956         # n_clusters
    957         if X.shape[0] < self.n_clusters:
--> 958             raise ValueError(f"n_samples={X.shape[0]} should be >= "
    959                              f"n_clusters={self.n_clusters}.")
    960 

ValueError: n_samples=1 should be >= n_clusters=4.

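“…it does not seem to work.”

Well, it does work. If you test your actual training_set (which we cannot reproduce here) before calling the .fit() method, you will get the direct cause of the reported error: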

N_COMPONENTS = 4
ERR_MASK     = ( "ERR: training_set was smaller than the N_COMPONENTS == {0:} "
               + "requested,\n"
               + "     whereas the actual shape[0] was {1:}"
                 )
...

hmm_model = hmm.GaussianHMM( n_components    =   N_COMPONENTS,
                             covariance_type =   "full",
                             n_iter          = 100
                             )
...

# Fit only if there are at least as many samples (rows) as hidden states
if training_set.shape[0] >= N_COMPONENTS:
    hmm_model.fit( training_set )
else:
    print( ERR_MASK.format( N_COMPONENTS, training_set.shape[0] ) )

This is the same test on X.shape[0] that appears in the traceback:

~/Git Projects/Aiguille Systems/allocationmodel/macromodelv2_venv/lib/python3.9/site-packages/sklearn/cluster/_kmeans.py in _check_params(self, X)
    956         # n_clusters
    957         if X.shape[0] < self.n_clusters:
--> 958             raise ValueError(f"n_samples={X.shape[0]} should be >= "
    959                              f"n_clusters={self.n_clusters}.")
--------------------------------------------------X.shape[0]------------

ValueError: n_samples=1 should be >= n_clusters=4.
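
The message says the array has a single row. The comment in the question ("Shape = [1,250]") suggests np.column_stack() is turning the 250 returns into one row of 250 columns. A minimal sketch of that effect and of one possible fix, reshaping to one observation per row, might look like the following. The returns array is synthetic stand-in data, not the asker's CSV, and the names wrong and right are illustrative only.

```python
import numpy as np

# Synthetic stand-in for df["Returns"]: 250 made-up daily returns.
rng = np.random.default_rng(0)
returns = rng.normal(loc=0.0, scale=0.01, size=250)

# As in the question: column_stack treats every scalar as its own column,
# so the result is a single row with 250 columns.
wrong = np.column_stack(returns)

# One observation per row, one feature per column.
right = returns.reshape(-1, 1)

print(wrong.shape)   # (1, 250)
print(right.shape)   # (250, 1)
```

With a pandas Series, the equivalent would presumably be df["Returns"].to_numpy().reshape(-1, 1), after which the number of rows (250) is no longer smaller than n_components (4).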

fit( X, lengths = None )

    Estimate model parameters.

    An initialization step is performed before entering the EM algorithm.
       If you want to avoid this step for a subset of the parameters,
       pass proper init_params keyword argument to estimator’s constructor.

    Parameters

            X ( array-like, shape ( n_samples, n_features ) )
              – Feature matrix of individual samples.

            lengths ( array-like of integers, shape ( n_sequences, ) )
              – Lengths of the individual sequences in X.
                The sum of these should be n_samples.
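
As a rough illustration of the lengths parameter described above, several sequences can be concatenated row-wise into one X, with lengths recording where each sequence ends. The two short sequences below are made-up values, and seq_a and seq_b are hypothetical names.

```python
import numpy as np

# Two hypothetical return sequences of different lengths (made-up values),
# each shaped (n_samples, n_features) with a single feature.
seq_a = np.array([0.010, -0.020, 0.005]).reshape(-1, 1)
seq_b = np.array([0.000, 0.015, -0.010, 0.020]).reshape(-1, 1)

# Stack the sequences row-wise and record the length of each one.
X = np.concatenate([seq_a, seq_b])
lengths = [len(seq_a), len(seq_b)]

# The lengths must add up to the number of rows in X.
assert sum(lengths) == X.shape[0]
print(X.shape, lengths)   # (7, 1) [3, 4]
```

Following the signature quoted above, these would then be passed as fit( X, lengths = lengths ).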
