Matthews correlation coefficient as a loss in keras

I am trying to write a custom loss function for Keras with the TF backend. I get the following error:

ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

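from keras import backend as K

def matthews_correlation(y_true, y_pred):
    # Hard 0/1 predictions; K.round is one of the ops without a gradient
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    y_pred_neg = 1 - y_pred_pos

    # Hard 0/1 labels
    y_pos = K.round(K.clip(y_true, 0, 1))
    y_neg = 1 - y_pos

    # Confusion-matrix counts over the whole batch
    tp = K.sum(y_pos * y_pred_pos)
    tn = K.sum(y_neg * y_pred_neg)

    fp = K.sum(y_neg * y_pred_pos)
    fn = K.sum(y_pos * y_pred_neg)

    numerator = (tp * tn - fp * fn)
    denominator = K.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    # 1 - MCC, so that a perfect prediction gives a value of 0
    return 1.0 - numerator / (denominator + K.epsilon())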

If I use this function as a metric rather than as the loss function, it works. How can I use this function as a loss?

After removing K.round I get the following error:

InvalidArgumentError: Can not squeeze dim[0], expected a dimension of 1, got 8 [[{{node loss_9/dense_10_loss/Squeeze}} = Squeeze[T=DT_FLOAT, squeeze_dims=[-1], _device="/job:localhost/replica:0/task:0/device:GPU:0"] (_arg_dense_10_sample_weights_0_2/_2445)]] [[{{node loss_9/add_12/_2467}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6418_loss_9/add_12", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

The answer is: you can't.

Let me explain a little why. First we need to define a few things:

  • loss: a loss function or cost function is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cost" associated with the event. An optimization problem seeks to minimize a loss function.

  • metric: in mathematics, a metric or distance function is a function that defines a distance between each pair of elements of a set.

  • optimizer: a way to optimize (minimize) a cost function.

Now, why can't we use a rate such as the true positive rate as a loss function? Because you can't minimize it directly. It is not convex, and you can't define the cost of each prediction individually. As its definition shows, it depends on all the answers together to calculate a rate; you can't calculate it for one sample.
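One concrete way to see the obstacle with the function exactly as written in the question: once the predictions are rounded, a small change in the model output leaves the score unchanged, so there is no slope for an optimizer to follow. The sketch below is a plain-Python stand-in for the K.* arithmetic (the function `mcc` and the sample values are made up for illustration, not taken from the question):

```python
import math

def mcc(y_true, y_pred, hard):
    """Batch MCC; with hard=True the predictions are rounded to 0/1 first."""
    if hard:
        y_pred = [float(round(p)) for p in y_pred]
    tp = sum(t * p for t, p in zip(y_true, y_pred))
    tn = sum((1 - t) * (1 - p) for t, p in zip(y_true, y_pred))
    fp = sum((1 - t) * p for t, p in zip(y_true, y_pred))
    fn = sum(t * (1 - p) for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / (denom + 1e-7)

y_true = [1.0, 0.0, 1.0, 0.0]
base = [0.70, 0.20, 0.60, 0.40]
nudged = [0.71, 0.20, 0.60, 0.40]  # first prediction moved slightly toward its label

# Rounded version: both inputs round to [1, 0, 1, 0], so nothing changes.
hard_change = mcc(y_true, nudged, hard=True) - mcc(y_true, base, hard=True)

# Unrounded version: the score reacts to the nudge.
soft_change = mcc(y_true, nudged, hard=False) - mcc(y_true, base, hard=False)

print("change with rounding:   ", hard_change)
print("change without rounding:", soft_change)
```

The rounded score is flat around almost every point, which is consistent with Keras listing K.round among the ops without a gradient.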

What can you do?

Use it as a metric, and use early stopping while following the evolution of this metric to get the best iteration.
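A minimal sketch of that setup, assuming TensorFlow 2.x with `tf.keras`. The metric here returns the MCC itself (not 1 - MCC), so the callback is told to maximize it; the patience value and the commented-out `model`, `x_train`, `y_train` are placeholders, not something from the question:

```python
import tensorflow as tf

def matthews_correlation(y_true, y_pred):
    """Hard-thresholded MCC; fine as a metric because no gradient is needed."""
    y_true = tf.cast(y_true, tf.float32)
    y_pred = tf.cast(y_pred, tf.float32)

    y_pred_pos = tf.round(tf.clip_by_value(y_pred, 0.0, 1.0))
    y_pred_neg = 1.0 - y_pred_pos
    y_pos = tf.round(tf.clip_by_value(y_true, 0.0, 1.0))
    y_neg = 1.0 - y_pos

    tp = tf.reduce_sum(y_pos * y_pred_pos)
    tn = tf.reduce_sum(y_neg * y_pred_neg)
    fp = tf.reduce_sum(y_neg * y_pred_pos)
    fn = tf.reduce_sum(y_pos * y_pred_neg)

    numerator = tp * tn - fp * fn
    denominator = tf.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / (denominator + 1e-7)

# Keras names a function metric after the function, with a "val_" prefix
# for the validation split, hence "val_matthews_correlation".
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_matthews_correlation",
    mode="max",                 # higher MCC is better
    patience=5,                 # assumed value; tune for your problem
    restore_best_weights=True,  # roll back to the best epoch seen
)

# Hypothetical usage with your own model and data:
# model.compile(optimizer="adam", loss="binary_crossentropy",
#               metrics=[matthews_correlation])
# model.fit(x_train, y_train, validation_split=0.2, epochs=100,
#           callbacks=[early_stop])
```

The model is still trained on a differentiable loss (binary cross-entropy in this sketch); the MCC only decides when to stop and which weights to keep.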

@Alexis has already given the answer to the error message, but I want to clarify something about loss functions that are derived from metrics:

In general, metrics cannot be used as loss functions, but smoothed versions of metrics, such as the Dice measure (= F1 score; CH Sudre 2014), can often be applied as loss functions. One use case might be image segmentation.
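Applied to the MCC from the question, the same smoothing idea could look like the sketch below: the rounding is dropped and the predicted probabilities are used directly as "soft" counts, so every op has a gradient. This is an illustration under assumptions (TensorFlow 2.x, sigmoid outputs in [0, 1], and a hypothetical name `soft_mcc_loss`), not an established Keras loss:

```python
import tensorflow as tf

def soft_mcc_loss(y_true, y_pred, eps=1e-7):
    """1 - MCC computed from probabilities instead of rounded predictions."""
    y_true = tf.cast(y_true, tf.float32)
    y_pred = tf.clip_by_value(tf.cast(y_pred, tf.float32), 0.0, 1.0)

    # Soft confusion-matrix entries, summed over the whole batch
    tp = tf.reduce_sum(y_true * y_pred)
    tn = tf.reduce_sum((1.0 - y_true) * (1.0 - y_pred))
    fp = tf.reduce_sum((1.0 - y_true) * y_pred)
    fn = tf.reduce_sum(y_true * (1.0 - y_pred))

    numerator = tp * tn - fp * fn
    # eps sits inside the sqrt so the gradient stays finite when the product is 0
    denominator = tf.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn) + eps)
    return 1.0 - numerator / denominator

# Quick check that a gradient actually flows back to the predictions
y_true = tf.constant([1.0, 0.0, 1.0, 0.0])
y_prob = tf.Variable([0.7, 0.2, 0.6, 0.4])
with tf.GradientTape() as tape:
    loss = soft_mcc_loss(y_true, y_prob)
grad = tape.gradient(loss, y_prob)
print("loss:", float(loss))
print("gradient:", grad.numpy())
```

Because the value is computed over the whole batch rather than per sample, it can be noisy with small batches, so it is worth comparing against a standard loss such as binary cross-entropy before relying on it.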

(Please excuse that I have to add this comment as an answer, since I do not have enough reputation to add a comment.)
