
Can a good model have a low R square value?

I fit a linear regression using scikit-learn.

When I look at the mean squared error on the test data, it is very low (0.09).

When I look at the r2_score on the test data, it is also very low (0.05).

As far as I know, a low mean squared error indicates a good model, but a very low r2_score says the model is not good.

I don't understand whether my regression model is good or not.

Can a good model have a low R square value, or can a bad model have a low mean squared error?

R^2 is a measure of how well your fit represents the data.
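To make that concrete, here is a minimal sketch of how R^2 relates to the mean squared error. It assumes the usual definition R^2 = 1 - SS_res / SS_tot, which is what `r2_score` computes; the helper `r2_manual` and the random test data are made up for illustration and are not part of the original answer.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score


def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # spread around the mean
    return 1.0 - ss_res / ss_tot


# Made-up data, only to check the identities numerically.
rng = np.random.default_rng(0)
y_true = rng.normal(size=200)
y_pred = y_true + rng.normal(scale=0.5, size=200)

mse = mean_squared_error(y_true, y_pred)
r2_from_mse = 1.0 - mse / np.var(y_true)  # np.var defaults to ddof=0

print(r2_manual(y_true, y_pred), r2_score(y_true, y_pred), r2_from_mse)

# Rearranged for the numbers quoted in the question (MSE 0.09, R^2 0.05):
implied_variance = 0.09 / (1.0 - 0.05)
print(implied_variance)
```

Since R^2 = 1 - MSE / Var(y), an MSE is only "low" relative to how much the target varies. If the question's two numbers were computed on the same test set, the implied variance of the target is about 0.095, so an MSE of 0.09 is only slightly better than always predicting the mean.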

Let's say your data has a linear trend with some noise on it. We can construct such data and see how R^2 changes:

Data

I'm going to create some data using numpy:

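import numpy as np

# 2000 integer x values in [10, 1000), with a linear trend plus small noise in [5, 10)
xs = np.random.randint(10, 1000, 2000)
ys = (3 * xs + 8) + np.random.randint(5, 10, 2000)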

[Image: scatter plot of the low-noise data]

Fit

Now we can create a fit object using scikit-learn:

from sklearn.linear_model import LinearRegression
reg = LinearRegression().fit(xs.reshape(-1, 1), ys.reshape(-1, 1))

And we can get the score (R^2) from this fit:

reg.score(xs.reshape(-1, 1), ys.reshape(-1, 1))

My R^2 was: 0.9999971914416896

Bad data

Let's say we have a set of more scattered data (with more noise on it).

ys2 = (3 * xs + 8) + np.random.randint(500, 1000, 2000)

[Image: scatter plot of the highly scattered data]

Now we can score the same fit (still the one trained on xs, ys) against ys2 to see how well it represents the xs, ys2 data:

reg.score(xs.reshape(-1, 1), ys2.reshape(-1, 1))

My R^2 was: 0.2377175028951054

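The score is low. We know the trend of the data did not change: it is still 3x + 8 + (noise). But the ys2 points lie much further from the fit. Note that this noise is not centered on zero (it averages about 750), so ys2 is also shifted upward relative to the line fitted on ys, and that constant offset accounts for much of the drop in the score.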
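The two effects (extra scatter and the upward shift) can be separated by refitting on ys2. The following is a sketch under assumptions rather than the original author's code: the seed, the use of `np.random.default_rng`, and the names `reg2`, `r2_reused` and `r2_refit` are all introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)  # arbitrary seed, chosen only for reproducibility
xs = rng.integers(10, 1000, 2000)
X = xs.reshape(-1, 1)

ys = (3 * xs + 8) + rng.integers(5, 10, 2000)       # low-noise data
ys2 = (3 * xs + 8) + rng.integers(500, 1000, 2000)  # noisier, shifted data

reg = LinearRegression().fit(X, ys)    # fit from the answer, trained on ys
reg2 = LinearRegression().fit(X, ys2)  # hypothetical refit on ys2

r2_reused = reg.score(X, ys2)   # model trained on ys, scored on ys2
r2_refit = reg2.score(X, ys2)   # model trained and scored on ys2

print("R^2, fit on ys, scored on ys2:", r2_reused)
print("R^2, refit on ys2:            ", r2_refit)
print("intercepts:", reg.intercept_, reg2.intercept_)
```

In this setup the refit should recover nearly the same slope with a much larger intercept, and its R^2 on ys2 should be far higher than the reused fit's, because the refit absorbs the offset and only the scatter remains.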

So R^2 is an indicator of how well your fit represents the data, but the condition of the data itself matters. Even with a low score, the fit you get may be the best possible one, because the data is scattered by noise.
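A case where the fit is about as good as it can be and R^2 is still low can be sketched with zero-mean noise that is large compared to the trend. The noise scale of 1500, the seed, and the names `ys3` and `reg3` are assumptions made for this illustration and do not come from the original answer.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(7)  # arbitrary seed, chosen only for reproducibility
xs = rng.integers(10, 1000, 2000)
X = xs.reshape(-1, 1)

# Same trend as before, but with zero-mean Gaussian noise of large spread
# (scale=1500 is a hypothetical value picked to dominate the signal).
noise = rng.normal(loc=0.0, scale=1500.0, size=2000)
ys3 = (3 * xs + 8) + noise

reg3 = LinearRegression().fit(X, ys3)
r2 = reg3.score(X, ys3)
mse = mean_squared_error(ys3, reg3.predict(X))

print("slope:", reg3.coef_[0])  # should land near the true slope of 3
print("R^2:  ", r2)             # low, because noise dominates the variance
print("MSE:  ", mse)            # large in absolute terms, on the scale of the noise
```

Here the model recovers the trend reasonably well, yet most of the variance in ys3 is noise that no model of x could explain, so R^2 stays low. The reverse situation from the question (small MSE, small R^2) arises when the target itself has very little variance.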


