Getting "No loop matching the specified signature and casting was found" error

Question

I am a beginner in Python and machine learning. I get the following error when I try to fit data with statsmodels.formula.api OLS.fit():

Traceback (most recent call last):

  File "", line 47, in
    regressor_OLS = sm.OLS(y , X_opt).fit()

  File "E:\Anaconda\lib\site-packages\statsmodels\regression\linear_model.py", line 190, in fit
    self.pinv_wexog, singular_values = pinv_extended(self.wexog)

  File "E:\Anaconda\lib\site-packages\statsmodels\tools\tools.py", line 342, in pinv_extended
    u, s, vt = np.linalg.svd(X, 0)

  File "E:\Anaconda\lib\site-packages\numpy\linalg\linalg.py", line 1404, in svd
    u, s, vt = gufunc(a, signature=signature, extobj=extobj)

TypeError: No loop matching the specified signature and casting was found for ufunc svd_n_s

Code

#Importing Libraries
import numpy as np # linear algebra
import pandas as pd # data processing
import matplotlib.pyplot as plt #Visualization


#Importing the dataset
dataset = pd.read_csv('Video_Games_Sales_as_at_22_Dec_2016.csv')
#dataset.head(10) 

#Encoding categorical data using the pandas get_dummies function. Easier and more straightforward than OneHotEncoder in sklearn
#dataset = pd.get_dummies(data = dataset , columns=['Platform' , 'Genre' , 'Rating' ] , drop_first = True ) #drop_first is used to avoid the dummy variable trap


dataset=dataset.replace('tbd',np.nan)

#Separating Independent & Dependent Variables
#X = pd.concat([dataset.iloc[:,[11,13]], dataset.iloc[:,13: ]] , axis=1).values  #Getting important  variables
X = dataset.iloc[:,[10,12]].values
y = dataset.iloc[:,9].values #Dependent Variable (Global sales)


#Taking care of missing data
from sklearn.preprocessing import Imputer
imputer =  Imputer(missing_values = 'NaN' , strategy = 'mean' , axis = 0)
imputer = imputer.fit(X[:,0:2])
X[:,0:2] = imputer.transform(X[:,0:2])


#Splitting the dataset into the Training set and Test set
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size = 0.2 , random_state = 0)

#Fitting Multiple Linear Regression to the Training Set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train,y_train)

#Predicting the Test set Result
y_pred = regressor.predict(X_test)


#Building the optimal model using Backward Elimination (p=0.050)
import statsmodels.formula.api as sm
X = np.append(arr = np.ones((16719,1)).astype(float) , values = X , axis = 1)

X_opt = X[:, [0,1,2]]
regressor_OLS = sm.OLS(y , X_opt).fit()
regressor_OLS.summary()

Dataset

Dataset link

I could not find anything on Stack Overflow or Google that helps solve this issue.

Answer 1

Try specifying

dtype = 'float'

when creating the matrix. Example:

a=np.matrix([[1,2],[3,4]], dtype='float')

Hope this works!
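To see why the dtype matters, here is a minimal sketch that reproduces the failure with plain NumPy. The small matrix is made up for illustration; the point is only that np.linalg.svd (which the traceback shows being called during the fit) rejects object-dtype input and accepts the same numbers once they are cast to float. The exact error message depends on the NumPy version.

```python
import numpy as np

# Hypothetical 3x2 design matrix whose numbers are stored as Python objects.
X_obj = np.array([[1, 2], [3, 4], [5, 6]], dtype=object)

try:
    np.linalg.svd(X_obj, full_matrices=False)
    svd_failed = False
except TypeError as exc:
    # The exact message varies between NumPy versions.
    svd_failed = True
    print("svd rejected object dtype:", type(exc).__name__)

# Casting to float gives the linear algebra routine a dtype it supports.
X_float = X_obj.astype(float)
u, s, vt = np.linalg.svd(X_float, full_matrices=False)
print(X_float.dtype, s.shape)
```

Checking X.dtype and y.dtype before calling OLS is a quick way to confirm whether this is the cause.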

Answer 2

As suggested previously, you need to make sure X_opt is a float type. For example, in your code it would look like this:

X_opt = X[:, [0,1,2]]
X_opt = X_opt.astype(float)
regressor_OLS = sm.OLS(endog=y, exog=X_opt).fit()
regressor_OLS.summary()
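As a rough illustration of where the non-float dtype can come from, the sketch below uses a small made-up DataFrame (the column names and values are assumptions, not the real dataset). A column that contained the string 'tbd' stays object-typed after replace(), so selecting it together with a numeric column yields an object array, which astype(float) then converts.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: one numeric column and one column
# where a placeholder string sits among numbers stored as text.
df = pd.DataFrame({
    "critic_score": [76.0, 82.0, np.nan, 80.0],
    "user_score": ["8", "tbd", "8.3", "7.9"],
})

df = df.replace("tbd", np.nan)

# Mixing a float column with a text column gives an object array.
X = df.iloc[:, [0, 1]].values
print(X.dtype)

# Every remaining entry is a number, a numeric string, or NaN,
# so the cast to float succeeds.
X_float = X.astype(float)
print(X_float.dtype)
```

If any non-numeric string were still present, astype(float) would raise a ValueError instead, which is a useful signal that more cleaning is needed.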

Answer 3

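I ran into a similar problem. I solved it by specifying the dtype and flattening the array.

NumPy version: 1.17.3

a = np.array(a, dtype=float)  # np.float was an alias for the builtin float and was removed in NumPy 1.24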
a = a.flatten()
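A small sketch of what those two lines do, using a made-up column vector (the values are only for illustration): the first call converts the object array to float64 and the second collapses the (n, 1) shape to (n,).

```python
import numpy as np

# Hypothetical target stored as an object-dtype column vector of shape (3, 1).
a = np.array([[1.5], [2.0], [3.25]], dtype=object)

a = np.array(a, dtype=float)  # object -> float64
a = a.flatten()               # (3, 1) -> (3,)

print(a.dtype, a.shape)
```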

Answer 4

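I faced a similar problem. I was using df.values[]:

y = df.values[:, 4]

I fixed the problem by using df.iloc[].values instead:

y = dataset.iloc[:, 4].values

When the DataFrame has columns of mixed types, df.values[] returns an array with object dtype: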

array([192261.83, 191792.06, 191050.39, 182901.99, 166187.94, 156991.12,
   156122.51, 155752.6, 152211.77, 149759.96, 146121.95, 144259.4,
   141585.52, 134307.35, 132602.65, 129917.04, 126992.93, 125370.37,
   124266.9, 122776.86, 118474.03, 111313.02, 110352.25, 108733.99,
   108552.04, 107404.34, 105733.54, 105008.31, 103282.38, 101004.64,
   99937.59, 97483.56, 97427.84, 96778.92, 96712.8, 96479.51,
   90708.19, 89949.14, 81229.06, 81005.76, 78239.91, 77798.83,
   71498.49, 69758.98, 65200.33, 64926.08, 49490.75, 42559.73,
   35673.41, 14681.4], dtype=object)

but

df.iloc[:, 4].values

returns a float array, which is what the OLS() function accepts in

regressor_OLS = sm.OLS(endog=y, exog=X_opt).fit()

Alternatively, you can change the dtype of y before passing it to OLS():

y = np.array(y, dtype = float)
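The difference between the two selections can be checked with a small made-up frame (the "state" column and the column names are assumptions for illustration; the numbers are taken from the array above). Calling .values on the whole frame converts everything to a common dtype first, while selecting the column first keeps its own dtype.

```python
import pandas as pd

# Hypothetical mixed-type frame: one text column, one numeric column.
df = pd.DataFrame({
    "state": ["New York", "California", "Florida"],
    "profit": [192261.83, 191792.06, 191050.39],
})

y_all = df.values[:, 1]       # whole frame converted first -> object
y_col = df.iloc[:, 1].values  # only the float column converted -> float64

print(y_all.dtype, y_col.dtype)
```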

Answer 5

Downgrading from NumPy 1.18.4 to 1.15.2 worked for me: pip install --upgrade numpy==1.15.2

Answer 1: score 52, accepted, 2018-05-07 02:25:57
Answer 2: score 3, 2018-12-17 02:52:30
Answer 3: score 3, 2020-06-15 17:55:12
Answer 4: score 2, 2020-07-16 14:57:35
Answer 5: score 0, 2020-05-31 19:12:58
