How to iterate through tensors in custom loss function?
I'm using Keras with the TensorFlow backend. My goal is to query the batch size of the current batch in a custom loss function. This is needed to compute values of the custom loss function that depend on the index of particular observations. I'd like to make this clearer with the minimal reproducible examples below.
(BTW: Of course I could use the batch size defined for the training procedure and plug in its value when defining the custom loss function, but there are some reasons why this can vary. In particular, if epochsize % batchsize (epochsize modulo batchsize) is not zero, the last batch of an epoch has a different size. I didn't find a suitable approach on Stack Overflow, in particular not in "Tensor indexing in custom loss function", "Tensorflow custom loss function in Keras - loop over tensor" and "Looping over a tensor", because the shape of a tensor can't be inferred while the graph is being built, which is the case for a loss function. Shape inference is only possible when evaluating with data, which in turn requires the graph. Hence I need to tell the custom loss function to do something with particular elements along a certain dimension without knowing the length of that dimension.)
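To make the varying batch size concrete, here is a small sketch in plain Python. It assumes the samples are simply split into consecutive chunks of `batch_size` (shuffling would not change the chunk sizes); the helper name `batch_sizes` is made up for illustration and is not part of Keras.

```python
def batch_sizes(n_samples, batch_size):
    """Return the size of every batch in one epoch (hypothetical helper)."""
    full, remainder = divmod(n_samples, batch_size)
    sizes = [batch_size] * full
    if remainder:
        sizes.append(remainder)  # the last, smaller batch
    return sizes

sizes = batch_sizes(1000, 32)
print(len(sizes), sizes[-1])  # 32 batches, the last one holds 8 samples
```

With 1000 samples and a batch size of 32, a value hard-coded into the loss function would be wrong for the final batch of every epoch.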
from keras.models import Sequential
from keras.layers import Dense, Activation
# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
(Output omitted, this runs perfectly fine)
def custom_loss(yTrue, yPred):
    loss = np.abs(yTrue-yPred)
    return loss
model.compile(optimizer='rmsprop',
              loss=custom_loss,
              metrics=['accuracy'])
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
(Output omitted, this runs perfectly fine)
def custom_loss(yTrue, yPred):
    print(yPred) # Output: Tensor("dense_2/Sigmoid:0", shape=(?, 1), dtype=float32)
    n = yPred.shape[0]
    for i in range(n): # TypeError: __index__ returned non-int (type NoneType)
        loss = np.abs(yTrue[i]-yPred[int(i/2)])
    return loss
model.compile(optimizer='rmsprop',
              loss=custom_loss,
              metrics=['accuracy'])
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
Of course the tensor has no shape info yet; it can't be inferred when building the graph, only at training time. Hence for i in range(n) raises an error. Is there any way to do this?
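The difference between the static shape (unknown while the graph is built) and the dynamic shape (known once data flows through) can be sketched as follows. This assumes TensorFlow 2.x rather than the older Keras setup used in the question; the function name `dynamic_batch_size` is made up for illustration.

```python
import tensorflow as tf

# The batch dimension is declared as None, so it is unknown while tracing.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 1], dtype=tf.float32)])
def dynamic_batch_size(y_pred):
    # Static shape: no integer available here, so range(n) cannot work.
    assert y_pred.shape[0] is None
    # Dynamic shape: a tensor that is filled in when the function runs on data.
    return tf.shape(y_pred)[0]

print(int(dynamic_batch_size(tf.zeros([32, 1]))))  # 32
print(int(dynamic_batch_size(tf.zeros([8, 1]))))   # 8
```

The dynamic value is itself a tensor, so it can be passed to other tensor operations but not to Python's `range`.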
The traceback of the output:
BTW, here's my true custom loss function in case of any questions. I skipped it above for clarity and simplicity.
from keras import backend as K

def neg_log_likelihood(yTrue,yPred):
    yStatus = yTrue[:,0]
    yTime = yTrue[:,1]
    n = yTrue.shape[0]
    for i in range(n):
        s1 = K.greater_equal(yTime, yTime[i])
        s2 = K.exp(yPred[s1])
        s3 = K.sum(s2)
        logsum = K.log(s3)
        loss = K.sum(yStatus[i] * yPred[i] - logsum)
    return loss
Here's an image of the partial negative log-likelihood of the Cox proportional hazards model.
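For reference, the partial log-likelihood of the Cox model is usually written in the textbook form below; this is not necessarily the exact notation of the image. The symbols are assumptions chosen to match the code: $\delta_i$ is the event indicator (yStatus), $t_i$ the observed time (yTime) and $\eta_i$ the model output (yPred).

```latex
\ell(\eta) = \sum_{i=1}^{n} \delta_i \left( \eta_i - \log \sum_{j:\, t_j \ge t_i} \exp(\eta_j) \right),
\qquad \text{loss} = -\ell(\eta)
```

The inner sum runs over the risk set of sample $i$, that is, all samples whose time is greater than or equal to $t_i$.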
This is to clarify a question in the comments to avoid confusion. I don't think it is necessary to understand this in detail to answer the question.
As usual, don't loop. There are severe performance drawbacks and also bugs. Use only backend functions unless a loop is totally unavoidable (usually it can be avoided).
So, there is a very weird thing there... Do you really want to simply ignore half of your model's predictions? (Example 3)
Assuming this is true, just duplicate your tensor in the last dimension, flatten it and discard half of it. You get the exact effect you want.
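from keras import backend as K

def custom_loss(true, pred):
    n = K.shape(pred)[0:1]                       #dynamic batch size as a 1-element tensor
    pred = K.concatenate([pred]*2, axis=-1)      #duplicate in the last axis
    pred = K.flatten(pred)                       #flatten
    pred = K.slice(pred,                         #take only half (= n samples)
                   K.constant([0], dtype="int32"),
                   n)
    pred = K.reshape(pred, (-1, 1))              #back to (n, 1) so it lines up with true
    return K.abs(true - pred)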
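The duplicate, flatten and slice trick can be checked against the loop from the question with a small NumPy sketch. NumPy stands in for the Keras backend here, and the array values are arbitrary illustration data.

```python
import numpy as np

pred = np.array([[0.1], [0.2], [0.3], [0.4]])  # shape (n, 1), like the model output
n = pred.shape[0]

# Loop version from the question: element i is paired with pred[int(i / 2)]
looped = np.array([pred[i // 2, 0] for i in range(n)])

# Loop-free version: duplicate along the last axis, flatten, keep the first n entries
vectorized = np.concatenate([pred, pred], axis=-1).flatten()[:n]

print(vectorized)  # [0.1 0.1 0.2 0.2]
```

Both versions select the same elements, but only the second one can be written without knowing `n` as a Python integer.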
If you have the times sorted from greater to lower, just do a cumulative sum.
Warning: If you have one time per sample, you cannot train with mini-batches, because each term sums over all samples with a greater or equal time. Use the full data set as a single batch:
batch_size = len(labels)
It makes sense to have time in an additional dimension (many times per sample), as is done in recurrent and 1D conv networks. Anyway, considering your example as expressed, that is, shape (samples_equal_times,) for yTime:
import tensorflow as tf
from keras import backend as K

def neg_log_likelihood(yTrue,yPred):
    yStatus = yTrue[:,0]
    yTime = yTrue[:,1]
    n = K.shape(yTrue)[0]

    #sort the times and everything else from greater to lower:
    #note: you can have the data sorted already and avoid doing it here for performance
    #important: yTime will be sorted in the last dimension, make sure it's (None,) in this case
    # or that it's (None, time_length) in the case of many times per sample
    #assumption: yPred is (None,) as well; flatten a (None, 1) model output first
    sortedTime, sortedIndices = tf.math.top_k(yTime, n, True)
    sortedStatus = K.gather(yStatus, sortedIndices)
    sortedPreds = K.gather(yPred, sortedIndices)

    #do the calculations
    exp = K.exp(sortedPreds)
    sums = K.cumsum(exp)  #this will have the sum for j >= i in the loop
    logsums = K.log(sums)

    return K.sum(sortedStatus * sortedPreds - logsums)
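Why the cumulative sum replaces the loop can be verified with a short NumPy sketch. It assumes distinct times (with ties, a plain running sum would not cover the whole risk set of tied samples); the data is random illustration data and NumPy stands in for the backend functions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
times = rng.permutation(n).astype(float)  # distinct times (assumption: no ties)
preds = rng.normal(size=n)

# Loop version: for each i, sum exp(pred_j) over all j with time_j >= time_i
loop_sums = np.array([np.exp(preds[times >= times[i]]).sum() for i in range(n)])

# Sorted version: order by time descending, then a running sum covers the same set
order = np.argsort(-times)
cum_sums = np.cumsum(np.exp(preds[order]))

# Undo the sort so that both arrays are indexed by the original sample order
unsorted = np.empty(n)
unsorted[order] = cum_sums

print(np.allclose(loop_sums, unsorted))  # True
```

After sorting by descending time, the samples with a time greater than or equal to the current one are exactly the ones already visited, which is what the running sum accumulates.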
According to the API specs for models, calling model.compile requires that the parameter loss be a string (it is also None by default). To that end, you might try calling it like this:
model.compile(...,
              loss="custom_loss",
              ...)
instead of the <...> string that is returned by custom_loss.