
Disregarding specific input losses when training network in Tensorflow

I have a binary classification problem in which only the first 3 time instances of positive events are annotated in the data set, and I know a priori that the events can last up to 15 time instances. To cope with this, I have decided to disregard the loss values from the 15 - 3 = 12 time instances following an annotation when training the network.

I am training an LSTM network using Tensorflow in Python. My training batches have sequence_len=240, and positive events can occur anywhere in a sequence in any iteration.

Basically, my cost metric is (using AdamOptimizer):

loss = tf.nn.softmax_cross_entropy_with_logits(logits, self._targets)
cost = tf.reduce_mean(loss)

I am thinking I simply need to remove the undesired elements from loss before passing it to tf.reduce_mean(). I developed an algorithm that builds the desired mask, under the assumption that targets is a numpy array:

def loss_mask(targets):
    # First column of the one-hot targets is the positive class.
    v = targets[:, 0]
    # Indices where an annotated positive run ends (v drops from 1 to 0).
    w = np.where(np.multiply(v[:-1] == 1, v[1:] < 1))[0]
    m = np.ones(v.shape, dtype=bool)
    for i in w:
        # Ignore the 12 time instances following the annotation.
        i1 = i + 1
        i2 = np.min((i + 13, len(v)))
        ind = np.arange(i1, i2, dtype=np.int)
        m[ind] = False
    return m
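As a sanity check, the same masking logic can be traced on a toy 20-step sequence in plain NumPy (a minimal sketch, outside any Tensorflow graph):

```python
import numpy as np

# Toy positive-class column: one event annotated at time steps 2-4.
v = np.zeros(20)
v[2:5] = 1

# Indices where an annotated positive run ends (v drops from 1 to 0).
w = np.where((v[:-1] == 1) & (v[1:] < 1))[0]

m = np.ones(v.shape, dtype=bool)
for i in w:
    # Ignore the 12 time steps following the annotated run.
    m[i + 1 : min(i + 13, len(v))] = False

print(np.flatnonzero(~m))  # time steps 5..16 are excluded from the loss
```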

This algorithm works, but not inside the Tensorflow graph, obviously because the input there is a tensor and not a numpy array.

Hence, my question: how do I migrate this small algorithm/mask to Tensorflow?

Quickly after posting this question I came up with the obvious answer myself. I will include a placeholder for "error weights" in the network:

self._mask = tf.placeholder(tf.float32, [None])

Then, in the cost function, I perform a weighted loss using this mask:

loss = tf.nn.softmax_cross_entropy_with_logits(logits, self._targets)
cost = tf.reduce_sum(tf.mul(loss, self._mask)) / tf.reduce_sum(self._mask)
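To confirm the semantics of this weighted cost: dividing the masked sum by the mask sum gives the mean over only the unmasked losses. A quick check in plain NumPy with made-up per-step loss values:

```python
import numpy as np

# Made-up per-step losses and a mask that zeroes out steps 1 and 3.
loss = np.array([0.5, 2.0, 1.0, 3.0])
mask = np.array([1.0, 0.0, 1.0, 0.0])

# Weighted cost as in the graph: sum(loss * mask) / sum(mask).
cost = np.sum(loss * mask) / np.sum(mask)
print(cost)  # 0.75, i.e. the mean of the two unmasked losses
```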

And then I generate the mask alongside my input data. This is also better because it lets me account for positive events occurring just before the start of a new sequence, which the previous solution could not do.
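One way to generate such a mask per batch might look like the sketch below (pure NumPy; the loss_mask name and the carry parameter are my own illustration, not from the original post). The carry value lets a mask that starts near the end of one sequence continue into the next one:

```python
import numpy as np

def loss_mask(v, carry=0, total=15, annotated=3):
    """Build a float mask for one sequence of positive-class targets v.

    carry is the number of masked steps still owed from the previous
    sequence. Returns the mask and the carry for the next sequence.
    """
    hide = total - annotated            # 12 steps to ignore per event
    m = np.ones(len(v), dtype=np.float32)
    m[:min(carry, len(v))] = 0.0        # finish masking the previous event

    # Indices where an annotated positive run ends (v drops from 1 to 0).
    ends = np.where((v[:-1] == 1) & (v[1:] < 1))[0]
    for i in ends:
        m[i + 1 : min(i + 1 + hide, len(v))] = 0.0

    # Pass leftover masked steps on to the next sequence.
    if len(v) and v[-1] == 1:
        new_carry = hide
    elif len(ends) and ends[-1] + 1 + hide > len(v):
        new_carry = ends[-1] + 1 + hide - len(v)
    else:
        new_carry = 0
    return m, new_carry

v1 = np.zeros(240)
v1[237:240] = 1                  # event annotated right at the sequence end
m1, carry = loss_mask(v1)        # carry == 12
v2 = np.zeros(240)
m2, _ = loss_mask(v2, carry=carry)  # first 12 steps of the next sequence masked
```

The resulting masks would then be fed to the self._mask placeholder through feed_dict together with each batch.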
