How to iterate over a tensor in a custom loss function?

Question

I'm using Keras with the TensorFlow backend. My goal is to query the batch size of the current batch in a custom loss function. This is needed to compute values of the custom loss function that depend on the index of particular observations. I'd like to make this clearer with the minimal reproducible examples below.

(BTW: Of course I could take the batch size defined for the training procedure and plug in its value when defining the custom loss function, but there are reasons why the batch size can vary. In particular, if epochsize % batchsize (epochsize modulo batchsize) is not zero, the last batch of an epoch has a different size. I didn't find a suitable approach on Stack Overflow, in particular not in "Tensor indexing in custom loss function", "Tensorflow custom loss function in Keras - loop over tensor" and "Looping over a tensor", because obviously the shape of a tensor can't be inferred when building the graph, which is the case for a loss function. Shape inference is only possible when evaluating given the data, which is only possible given the graph. Hence I need to tell the custom loss function to do something with particular elements along a certain dimension without knowing the length of that dimension.)
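To make the last-batch point concrete, here is a small sketch using the numbers from the examples in this question (1000 samples, batch size 32). The helper name batch_sizes is made up for illustration and is not part of Keras.

```python
def batch_sizes(n_samples, batch_size):
    """Return the size of every batch in one epoch, in order."""
    full, remainder = divmod(n_samples, batch_size)
    sizes = [batch_size] * full
    if remainder:
        sizes.append(remainder)  # the final, smaller batch
    return sizes

sizes = batch_sizes(1000, 32)
print(len(sizes), sizes[-1])  # 32 batches, the last one holds 8 samples
```

So a loss function that hard-codes 32 would be wrong for the last batch of every epoch.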

(This is the same in all examples.)

from keras.models import Sequential
from keras.layers import Dense, Activation

# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))

Example 1: nothing special, no issue, no custom loss

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])    

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)

(Output omitted; this runs perfectly fine.)

Example 2: nothing special, with a fairly simple custom loss

def custom_loss(yTrue, yPred):
    loss = np.abs(yTrue-yPred)
    return loss

model.compile(optimizer='rmsprop',
              loss=custom_loss,
              metrics=['accuracy'])

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)

(Output omitted; this runs perfectly fine.)

Example 3: the issue

def custom_loss(yTrue, yPred):
    print(yPred) # Output: Tensor("dense_2/Sigmoid:0", shape=(?, 1), dtype=float32)
    n = yPred.shape[0]
    for i in range(n): # TypeError: __index__ returned non-int (type NoneType)
        loss = np.abs(yTrue[i]-yPred[int(i/2)])
    return loss

model.compile(optimizer='rmsprop',
              loss=custom_loss,
              metrics=['accuracy'])

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)

Of course the tensor has no shape info yet: it can't be inferred when building the graph, only at training time. Hence for i in range(n) raises an error. Is there any way to perform this?

The traceback of the output: (omitted)

BTW, here's my true custom loss function in case of any questions. I skipped it above for clarity and simplicity.

def neg_log_likelihood(yTrue,yPred):
    yStatus = yTrue[:,0]
    yTime = yTrue[:,1]    
    n = yTrue.shape[0]    
    for i in range(n):
        s1 = K.greater_equal(yTime, yTime[i])
        s2 = K.exp(yPred[s1])
        s3 = K.sum(s2)
        logsum = K.log(s3)
        loss = K.sum(yStatus[i] * yPred[i] - logsum)
    return loss

Here's an image of the partial negative log-likelihood of the Cox proportional hazards model.

This is to clarify a question in the comments to avoid confusion. I don't think it is necessary to understand this in detail to answer the question.
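For readers without the image, the negative partial log-likelihood of the Cox model is usually written as below. The notation is an assumption and is not taken from the image: $\hat{y}_i$ is the model output for observation $i$ (yPred), $t_i$ is the observed time (yTime), and $\delta_i \in \{0, 1\}$ is the event indicator (yStatus).

```latex
-\ell = -\sum_{i=1}^{n} \delta_i \left( \hat{y}_i - \log \sum_{j:\, t_j \ge t_i} \exp(\hat{y}_j) \right)
```

The inner sum runs over all observations whose time is at least $t_i$, which is what the mask s1 in the loop above selects.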

Answer 1

As usual, don't loop. There are severe performance drawbacks and also bugs. Use only backend functions unless totally unavoidable (usually it's not unavoidable).
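As a minimal sketch of what "use backend functions" means for the batch size itself: the static attribute .shape is fixed when the graph is built, while the shape op returns a tensor that is filled in when data flows through. The snippet assumes TensorFlow 2.x and uses tf.shape directly (K.shape is the Keras backend counterpart); the function name dynamic_batch_size is made up for illustration.

```python
import tensorflow as tf

# The batch dimension is declared as None, so it is unknown while the graph is traced.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 1], dtype=tf.float32)])
def dynamic_batch_size(y_pred):
    assert y_pred.shape[0] is None   # static shape: not known at graph-build time
    return tf.shape(y_pred)[0]       # dynamic shape: a scalar int32 tensor

n_full = int(dynamic_batch_size(tf.zeros([32, 1])))
n_last = int(dynamic_batch_size(tf.zeros([8, 1])))
print(n_full, n_last)
```

The returned value is a tensor, so it can be passed to other backend ops (slicing, top_k, and so on) but not to Python's range.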

Solution for example 3:

So, there is a very weird thing there...

Do you really want to simply ignore half of your model's predictions? (Example 3)

Assuming this is true, just duplicate your tensor in the last dimension, flatten it and discard half of it. You get the exact effect you want.

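from keras import backend as K

def custom_loss(true, pred):
    n = K.shape(pred)[0:1]                     # batch size as a 1-element int32 tensor

    pred = K.concatenate([pred] * 2, axis=-1)  # duplicate in the last axis: (n, 2)
    pred = K.flatten(pred)                     # flatten to (2n,): p0, p0, p1, p1, ...
    pred = K.slice(pred,                       # take only the first half (= n samples)
                   K.constant([0], dtype="int32"),
                   n)

    # flatten `true` as well so that (n, 1) - (n,) does not broadcast to (n, n)
    return K.abs(K.flatten(true) - pred)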
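To see why duplicate, flatten and slice reproduces the indexing yPred[int(i/2)] from example 3, here is a small NumPy sketch of the same steps. NumPy is used only as a stand-in for the backend ops, and the input values are made up.

```python
import numpy as np

pred = np.arange(6, dtype=float).reshape(6, 1)   # stand-in for yPred, shape (n, 1)
n = pred.shape[0]

# Loop version from example 3: element i is paired with pred[i // 2].
looped = np.array([pred[i // 2, 0] for i in range(n)])

# Loop-free version: duplicate along the last axis, flatten, keep the first n values.
duplicated = np.concatenate([pred, pred], axis=-1)   # shape (n, 2)
vectorized = duplicated.flatten()[:n]                # shape (n,)

print(looped)
print(vectorized)
```

Both arrays contain each of the first n/2 predictions twice, in order.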

Solution for your loss function:

If you have the times sorted from greater to lower, just do a cumulative sum.

Warning: If you have one time per sample, you cannot train with mini-batches!
batch_size = len(labels)

It makes sense to have time in an additional dimension (many times per sample), as is done in recurrent and 1D conv networks. Anyway, considering your example as expressed, that is, shape (samples_equal_times,) for yTime:

import tensorflow as tf
from keras import backend as K

def neg_log_likelihood(yTrue, yPred):
    yStatus = yTrue[:, 0]
    yTime = yTrue[:, 1]
    yPred = K.flatten(yPred)    # (None, 1) -> (None,), so it lines up with yStatus
    n = K.shape(yTrue)[0]

    # Sort the times and everything else from greater to lower.
    # Note: you can have the data sorted already and avoid doing it here, for performance.

    # Important: top_k sorts along the last dimension, so make sure yTime is (None,) in this case,
    # or (None, time_length) in the case of many times per sample.
    _, sortedIndices = tf.math.top_k(yTime, k=n, sorted=True)
    sortedStatus = K.gather(yStatus, sortedIndices)
    sortedPreds = K.gather(yPred, sortedIndices)

    # Do the calculations.
    exp = K.exp(sortedPreds)
    sums = K.cumsum(exp)    # this will have the sum for j >= i in the loop
    logsums = K.log(sums)

    return K.sum(sortedStatus * sortedPreds - logsums)
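The claim that a cumulative sum over time-sorted predictions replaces the per-sample loop can be checked with a short NumPy sketch. The data is random and made up, and NumPy stands in for the backend ops.

```python
import numpy as np

rng = np.random.default_rng(0)
times = rng.permutation(10).astype(float)   # distinct times, so there are no ties
preds = rng.normal(size=10)

# Loop version: for each i, sum exp(pred_j) over all j with time_j >= time_i.
loop_sums = np.array([np.exp(preds[times >= t]).sum() for t in times])

# Loop-free version: sort by time, greatest first, then take a cumulative sum.
order = np.argsort(-times)
cum_sums = np.cumsum(np.exp(preds[order]))

# cum_sums is in sorted order; reorder loop_sums the same way to compare.
print(np.allclose(loop_sums[order], cum_sums))
```

The two agree exactly only when all times are distinct. With tied times, the cumulative sum leaves out tied samples that happen to be sorted later, whereas the mask time_j >= time_i includes them.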

Answer 2

According to the API specs for models, calling model.compile requires that the parameter loss be a string (it is also None by default). To that end, you might try calling it like this:

model.compile(...,
              loss="custom_loss",
              ...)

instead of the <...> string that is returned by custom_loss.

Answer 1: score 4, accepted, 2019-11-28 17:04:54

Answer 2: score -2, 2019-11-13 01:02:02
