Purpose of decay parameter in nnet function in R?

I am using the nnet function in R to train my neural network. I am not clear on what the decay parameter in nnet is. Is it the step size used in the gradient descent method, or a regularization parameter used to overcome overfitting?

It's regularization to avoid over-fitting.

From the documentation (pdf):

decay: parameter for weight decay. Default 0.

Further information is available in the authors' book, Modern Applied Statistics with S, Fourth Edition, page 245:

One way to ensure that f is smooth is to restrict the class of estimates, for example, by using a limited number of spline knots. Another way is regularization, in which the fit criterion is altered to

E + λC(f)

with a penalty C on the 'roughness' of f. Weight decay, specific to neural networks, uses as penalty the sum of squares of the weights wij. ... The use of weight decay seems both to help the optimization process and to avoid over-fitting. (emphasis added)

Complementing blahdiblah's answer: by looking at the source code, I think the parameter weights corresponds to the learning rate of back-propagation (by reading the manual I couldn't understand what it was). Look at the file nnet.c, line 236, inside the function fpass:

TotalError += wx * E(Outputs[i], goal[i - FirstOutput]);

here, in a very intuitive nomenclature, E corresponds to the back-propagation error and wx is a parameter passed to the function, which eventually corresponds to the identifier Weights[i].

You can also be sure that the parameter decay is indeed what it claims to be by going to lines 317~319 of the same file, inside the function VR_dfunc:

for (i = 0; i < Nweights; i++)
    sum1 += Decay[i] * p[i] * p[i];
*fp = TotalError + sum1;

where p corresponds to the connections' weights, which is the exact definition of weight-decay regularization.
