Python linear regression: confidence is less than 0

Question

I have the share prices df2[x] below as Y:

2018-09-05    6.22
2018-09-06    6.19
2018-09-07    6.22
2018-09-10    6.24
2018-09-11    6.24

... ...

2018-12-05    4.65
2018-12-14    0.00

and the short positions csvReader5[x] as X:

2018-09-06    1.11
2018-09-07    1.04
2018-09-10    1.61
2018-09-11    1.52
2018-09-12    1.61

..
2018-12-05    0.98
2018-12-14    7.00

This is my code to calculate the confidence level:

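 import numpy
 from sklearn import preprocessing
 from sklearn.linear_model import LinearRegression
 # sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
 from sklearn.model_selection import train_test_split

 y = numpy.array(csvReader5[x]).reshape(-1, 1)  # target: short positions
 X = numpy.array(df2[x]).reshape(-1, 1)         # feature: share prices
 X = preprocessing.scale(X)                     # standardize the feature

 # Random 80/20 split; no random_state is set, so the split differs on every run
 X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
 clf = LinearRegression()
 clf.fit(X_train, y_train)
 confidence = clf.score(X_test, y_test)  # R^2 on the held-out 20%
Out: -1.08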

The confidence level I get changes every time I run the code, and it is always smaller than 1. I thought the confidence level was the same as R squared and hence should always be between (0,1)?
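On the run-to-run variation: train_test_split shuffles the rows before splitting, so without a fixed seed each run fits and scores on a different subset. The sketch below illustrates this under stated assumptions: the data is a hypothetical random stand-in (not the share price or short position series from the question), and the helper name held_out_score is made up for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 60 points with no real relationship between X and y.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 1))
y = rng.normal(size=(60, 1))


def held_out_score(seed):
    """Fit on a seeded 80/20 split and return R^2 on the held-out 20%."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    return LinearRegression().fit(X_train, y_train).score(X_test, y_test)


# Different seeds give different splits, hence different scores.
scores = [held_out_score(seed) for seed in range(5)]
print(scores)

# The same seed reproduces the same split and therefore the same score.
print(held_out_score(42) == held_out_score(42))
```

Passing random_state to train_test_split in the original code should make the reported score repeatable, although it would still be the score of one particular split.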

Answer 1

From the sklearn documentation:

score(X, y, sample_weight=None)

Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.
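The definition quoted above can be checked by hand. The following is a minimal sketch, assuming small made-up arrays rather than the data from the question; the helper r_squared is a hypothetical name that simply transcribes the 1 - u/v formula.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def r_squared(y_true, y_pred):
    """R^2 = 1 - u/v, transcribed from the quoted definition."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
    v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
    return 1 - u / v


y_true = np.array([1.0, 2.0, 3.0, 4.0])

# Perfect predictions: u = 0, so R^2 = 1.
perfect = r_squared(y_true, y_true)

# Constant model that always predicts the mean: u == v, so R^2 = 0.
constant = r_squared(y_true, np.full_like(y_true, y_true.mean()))

# Predictions worse than the mean: u = 20, v = 5, so R^2 = 1 - 20/5.
worse = r_squared(y_true, np.array([4.0, 3.0, 2.0, 1.0]))

print(perfect, constant, worse)  # 1.0 0.0 -3.0

# The same thing happens with a fitted model when the held-out points
# do not follow the trend learned from the training points.
X_train, y_train = np.array([[0.0], [1.0], [2.0]]), np.array([0.0, 1.0, 2.0])
X_test, y_test = np.array([[3.0], [4.0]]), np.array([1.0, 0.0])

clf = LinearRegression().fit(X_train, y_train)
score = clf.score(X_test, y_test)
manual = r_squared(y_test, clf.predict(X_test))
print(score, manual)  # both negative and equal up to rounding
```

In other words, a negative score on the test split only says that, on those rows, the fitted line predicts worse than the mean of y_test would.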

Solution 1: 2, accepted, 2018-12-15 20:40:43
