I have the share prices df2[x] below as Y:
2018-09-05 6.22
2018-09-06 6.19
2018-09-07 6.22
2018-09-10 6.24
2018-09-11 6.24
...
2018-12-05 4.65
2018-12-14 0.00
and the short positions csvReader5[x] as X:
2018-09-06 1.11
2018-09-07 1.04
2018-09-10 1.61
2018-09-11 1.52
2018-09-12 1.61
..
2018-12-05 0.98
2018-12-14 7.00
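This is my code to calculate the confidence level:
import numpy
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split

y = numpy.array(csvReader5[x]).reshape(-1, 1)
X = numpy.array(df2[x]).reshape(-1, 1)
X = preprocessing.scale(X)  # standardize to zero mean and unit variance
# random 80/20 split; no random_state, so the split differs on every run
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = LinearRegression()
clf.fit(X_train, y_train)
confidence = clf.score(X_test, y_test)
Out: -1.08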
The confidence level I get changes every time I run the code, and it is always smaller than 1. I thought the confidence level was the same as R squared and hence should always be between 0 and 1?
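The run-to-run variation is consistent with train_test_split shuffling the rows at random before splitting, so each run scores the model on a different test set. A minimal sketch of that effect follows, assuming synthetic data rather than the series above; the helper score_for_seed and the arrays X_demo and y_demo are hypothetical names introduced only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def score_for_seed(X, y, seed):
    """Fit on a random 80/20 split fixed by `seed`; return R^2 on the test part."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    return model.score(X_test, y_test)


# Synthetic stand-in data (an assumption): y_demo is generated independently of X_demo.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 1))
y_demo = rng.normal(size=60)

# Different seeds give different splits, and therefore different scores.
scores = [score_for_seed(X_demo, y_demo, seed) for seed in range(5)]
print(scores)

# Reusing a seed reproduces the same split and the same score.
print(score_for_seed(X_demo, y_demo, 0) == score_for_seed(X_demo, y_demo, 0))
```

Passing random_state makes a single run reproducible, but it does not make the score any more reliable; with a small data set, the spread across seeds is probably the more informative number.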
From sklearn documentation:
score(X, y, sample_weight=None)
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.
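The definition quoted above can be checked by hand. The sketch below computes 1 - u/v directly with NumPy on a made-up four-point example; the function name r_squared and the sample values are assumptions for illustration, not part of scikit-learn.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination, computed as 1 - u/v from the quoted definition."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
    v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
    return 1.0 - u / v


y_true = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r_squared(y_true, y_true)                       # exact predictions
mean_only = r_squared(y_true, np.full(4, y_true.mean()))  # always predict the mean
reversed_pred = r_squared(y_true, y_true[::-1])           # worse than the mean

print(perfect, mean_only, reversed_pred)  # 1.0 0.0 -3.0
```

Whenever the predictions are worse than always predicting the mean, u exceeds v and the score drops below 0, so a negative value such as -1.08 is possible on held-out data.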