
Why do I get nan when using the sklearn R2 function?

I am always predicting the next value with an sklearn model.

y1_test, y2_test, y3_test, y4_test = get_test_targets(df)

ypred1, ypred2, ypred3, ypred4 = ml_model(df, ElasticNet())

I would like to use sklearn to measure the r2 score of y_true and y_predicted.

np.array([y2_test])
array([6.75233645])

np.array([ypred2[0]])
array([6.75233645])

Using r2_score(np.array([y2_test]), np.array([ypred2[0]])) gives nan.

I do not understand why I am getting nan.

There is a warning telling you what is wrong:

import numpy as np
from sklearn.metrics import r2_score

x = np.array([2.3])
y = np.array([2.1]) # exact values do not matter

r2_score(x, y)

Result:

UndefinedMetricWarning: R^2 score is not well-defined with less than two samples.
  warnings.warn(msg, UndefinedMetricWarning)

nan

This should not be a surprise: the definition of R^2 is

R^2 = 1 - (residual_sum_squares)/(total_sum_squares)

but with only one sample the total sum of squares is 0, because the mean of a single value is that value itself. In your case the prediction equals the true value, so the residual sum of squares is 0 as well, leading to a 0/0 division, which is indeed a nan (computationally, as well as mathematically). Whatever the values, r2_score warns and returns nan whenever it gets fewer than two samples.
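To see where the division breaks down, here is a minimal sketch that computes the two sums of squares by hand with NumPy, using the standard definition R^2 = 1 - SS_res/SS_tot. The helper name r2_by_hand is made up for illustration and is not part of sklearn; it only mimics the arithmetic, not sklearn's input checks.

```python
import numpy as np

def r2_by_hand(y_true, y_pred):
    """Illustrative helper (not an sklearn function): R^2 from its definition."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    # Silence NumPy's divide-by-zero / invalid-value warnings for this demo
    with np.errstate(divide="ignore", invalid="ignore"):
        r2 = 1.0 - ss_res / ss_tot
    return ss_res, ss_tot, r2

# The single pair from the question: both sums are 0, so the ratio is 0/0
ss_res, ss_tot, r2 = r2_by_hand([6.75233645], [6.75233645])
print(ss_res, ss_tot, r2)  # 0.0 0.0 nan
```

With a single sample, ss_tot is 0 no matter what the value is, so the ratio can never be a meaningful number.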

Bottom line: you should not use only a single pair of data to compute R^2; batch together more pairs of predictions & ground truth samples in order to get meaningful R^2 results.
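As a rough sketch of what batching could look like: collect the true value and the prediction from several steps into two arrays, then call r2_score once on the whole batch. The numbers below are hypothetical placeholders, not output from the model in the question.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical values: true targets and one-step-ahead predictions
# gathered over four prediction steps
y_true_batch = np.array([6.75, 7.10, 6.90, 7.40])
y_pred_batch = np.array([6.80, 7.00, 7.00, 7.30])

# With two or more samples (and targets that are not all identical),
# the total sum of squares is non-zero and R^2 is well-defined
score = r2_score(y_true_batch, y_pred_batch)
print(score)
```

If all the true values in a batch happen to be identical, the total sum of squares is 0 again, so the batch should contain some variation in the targets.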

