
Python: ValueError: Input must be 1- or 2-d
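I have this code that estimates a model using Tobit regression in Python. The code is split into three parts: data definition, construction of the estimator, and estimation.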
import numpy as np
from scipy.optimize import minimize
# define the dependent variable and independent variables
X = data.iloc[:, 1:]
y = data.iloc[:, 0]
# Add a column of ones to the independent variables for the constant term
X = np.c_[np.ones(X.shape[0]), X]
# Define the likelihood function for the Tobit model
def likelihood(params, y, X, lower, upper):
    beta = params[:-1]
    sigma = params[-1]
    mu = X @ beta
    prob = (1 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-0.5 * ((y - mu) / sigma)**2))
    prob[y < lower] = 0
    prob[y > upper] = 0
    return -np.log(prob).sum()
# Set the initial values for the parameters and the lower and upper bounds for censoring
params_init = np.random.normal(size=X.shape[1] + 1)
bounds = [(None, None) for i in range(X.shape[1])] + [(1e-10, None)]
# Perform the MLE estimation
res = minimize(likelihood, params_init, args=(y, X, 0, 100), bounds=bounds, method='L-BFGS-B')
# Extract the estimated parameters and their standard errors
params = res.x
stderr = np.sqrt(np.diag(res.hess_inv))
# Print the results
print(f'Coefficients: {params[:-1]}')
print(f'Standard Errors: {stderr[:-1]}')
print(f'Sigma: {params[-1]:.4f}')
Why am I getting this error message? Thank you.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-245-5f39f416cc07> in <module>
31 # Extract the estimated parameters and their standard errors
32 params = res.x
---> 33 stderr = np.sqrt(np.diag(res.hess_inv))
34
35 # Print the results
/opt/anaconda3/lib/python3.8/site-packages/numpy/core/overrides.py in diag(*args, **kwargs)
/opt/anaconda3/lib/python3.8/site-packages/numpy/lib/twodim_base.py in diag(v, k)
307 return diagonal(v, k)
308 else:
--> 309 raise ValueError("Input must be 1- or 2-d.")
310
311
ValueError: Input must be 1- or 2-d.
Edit: If you want to see the kind of data I am working with, you can simulate it with these lines of code I just wrote:
import numpy as np
import pandas as pd
data = pd.DataFrame()
# Append 'interview probabilities' for individuals with and without disabilities
interview_prob_disabled = np.random.normal(38.63, 28.72, 619)
interview_prob_enabled = np.random.normal(44.27, 28.19, 542)
interview_prob = np.append(interview_prob_disabled, interview_prob_enabled)
# Correct the variable by its mean and standard deviation, without it being negative, nor exceeding 100, nor a float
interview_prob = np.clip(interview_prob, 0, 100)
interview_prob = np.round(interview_prob)
# Add the 'interview probabilities' variable to the dataframe
data['Interview Probabilities'] = interview_prob
# Add other variables such as age, gender, employment status, education, etc.
data['Age'] = np.random.randint(18, 65, size=len(interview_prob))
data['Gender'] = np.random.choice(['Male', 'Female'], size=len(interview_prob))
data['Employment Status'] = np.random.choice(['Employed', 'Unemployed', 'Retired'], size=len(interview_prob))
data['Education Level'] = np.random.choice(['High School', 'College', 'Vocational', 'Graduate School'], size=len(interview_prob))
# Add a 'disability status' variable as a dummy
data['Disability Status'] = np.append(np.repeat('Disabled', 619), np.repeat('Non-disabled', 542))
# Categorical variables
data['Gender'] = data['Gender'].map({'Male': 0, 'Female': 1})
data['Employment Status'] = data['Employment Status'].map({'Employed': 0, 'Unemployed': 1, 'Retired': 2})
data['Education Level'] = data['Education Level'].map({'High School': 0, 'College': 1, 'Vocational': 2, 'Graduate School': 3})
data['Disability Status'] = data['Disability Status'].map({'Disabled': 1, 'Non-disabled': 0})
# Print the df
data
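The problem is that your solver, L-BFGS-B, returns .hess_inv as an LbfgsInvHessProduct object (a linear operator) rather than as an array (which is what BFGS returns). np.diag only accepts 1-d or 2-d array input, so it raises the ValueError.
One way to solve your problem is to use res.hess_inv.todense() instead.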
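As a minimal sketch of that fix, the snippet below runs L-BFGS-B on a toy quadratic objective and converts .hess_inv to a dense array before taking its diagonal. The names objective and target are hypothetical and stand in for the Tobit likelihood from the question, which is not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy objective (NOT the likelihood from the question):
# a convex quadratic whose minimum is at `target`.
target = np.array([1.0, -2.0, 0.5])

def objective(params):
    return 0.5 * np.sum((params - target) ** 2)

res = minimize(objective, np.zeros(3), method='L-BFGS-B')

# With L-BFGS-B, hess_inv is a linear operator, not an ndarray.
print(type(res.hess_inv).__name__)

# Convert it to a dense 2-d array before taking the diagonal.
hess_inv_dense = res.hess_inv.todense()
stderr = np.sqrt(np.diag(hess_inv_dense))

print(hess_inv_dense.shape)
print(stderr)
```

Note that the matrix L-BFGS-B returns is a limited-memory approximation of the inverse Hessian, so standard errors derived from it should be treated as rough estimates. Switching to method='BFGS' would give an array directly, but that solver does not handle bounds, so the lower bound on sigma would be lost.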