Azure Machine Learning vs. Python: overfitting/underfitting machine learning models

Question

I'm learning how to perform machine learning with Azure ML Studio. So far, I've only experimented with machine learning in Python.

I have run identical machine learning projects in both Azure ML and Python to see how closely the results of the two products agree, measured by Root Mean Squared Error (RMSE). So far the RMSE has been widely different between Azure ML and Python.

I can't figure out why the RMSE values are so far apart. The only reason I can think of is the way Python 'fits' the model on the training data. In Python, I use the following code to fit the training data:

from pyspark.ml.regression import LinearRegression  # assumed: labelCol suggests PySpark ML

lr = LinearRegression(labelCol='xxxx')
lrModel = lr.fit(train_data)

However, I don't know how Azure ML fits the training data.

Can someone let me know how Azure ML fits the training data?

Answer 1

I'm guessing you used RMSE = √( 1/n ∑ (y_i - pred_i)^2 ) to calculate the RMSE in Python, where y_i are the true labels and pred_i are the values predicted by the linear regression?

I can imagine Azure using a slightly different quantity, namely RSE = √( 1/(n-2) ∑ (y_i - pred_i)^2 ), in which a degrees-of-freedom correction 1/(n-2), in the spirit of Bessel's correction, replaces 1/n. This corrects for the bias of fitting 2 parameters (assuming the linear regression fits only the slope and the intercept; otherwise the factor is 1/(n-k) when fitting k parameters in a multivariate linear regression).

Try it out! However, I cannot explain why the difference between Python and Azure is so large, since a Bessel-style correction term should only make a small difference.
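
To get a feel for how much the normalisation alone can matter, here is a minimal sketch in plain Python. The function names `rmse` and `rse`, the parameter `k`, and the sample values are made up for illustration; nothing here is confirmed to be what Azure ML actually computes.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error with the plain 1/n normalisation."""
    n = len(y_true)
    sse = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return math.sqrt(sse / n)


def rse(y_true, y_pred, k=2):
    """Same sum of squared errors, divided by n - k instead of n.

    k is the number of fitted parameters (2 for slope + intercept).
    """
    n = len(y_true)
    if n <= k:
        raise ValueError("need more observations than fitted parameters")
    sse = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return math.sqrt(sse / (n - k))


# Hypothetical labels and predictions, not taken from the question.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
y_pred = [2.5, 5.5, 6.5, 9.5, 10.5, 13.5]

plain = rmse(y_true, y_pred)
corrected = rse(y_true, y_pred, k=2)
ratio = corrected / plain  # equals sqrt(n / (n - k)) for any data

print(f"RMSE (1/n):     {plain:.4f}")
print(f"RSE  (1/(n-k)): {corrected:.4f}")
print(f"ratio:          {ratio:.4f}")
```

The ratio between the two values is always √( n/(n-k) ), so with thousands of rows it is very close to 1. If the gap between the two tools is much larger than that, the normalisation is unlikely to be the whole story.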
