
Differentiable round function in Tensorflow?

So the output of my network is a list of probabilities, which I then round using tf.round() to be either 0 or 1; this is crucial for this project. I then found out that tf.round isn't differentiable, so I'm kinda lost there.. :/

Something along the lines of x - sin(2πx)/(2π)?

I'm sure there's a way to squish the slope to be a bit steeper.


You can use the fact that tf.maximum() and tf.minimum() are differentiable, and that the inputs are probabilities from 0 to 1:

# round numbers less than 0.5 to zero
# by making them negative and taking the maximum with 0
differentiable_round = tf.maximum(x - 0.499, 0)
# scale the remaining numbers (0 to 0.5) to greater than 1
# the other half (zeros) is not affected by multiplication
differentiable_round = differentiable_round * 10000
# take the minimum with 1
differentiable_round = tf.minimum(differentiable_round, 1)

Example:

[0.1,    0.5,   0.7]
[-0.399, 0.001, 0.201] # x - 0.499
[0,      0.001, 0.201] # max(x - 0.499, 0)
[0,      10,    2010]  # max(x - 0.499, 0) * 10000
[0,      1.0,   1.0]   # min(max(x - 0.499, 0) * 10000, 1)
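The three steps above compose into one differentiable expression; a minimal check (assuming TF 2.x eager mode):

```python
import tensorflow as tf

x = tf.constant([0.1, 0.5, 0.7])

# same three steps as above, composed into one expression
differentiable_round = tf.minimum(tf.maximum(x - 0.499, 0) * 10000, 1)

print(differentiable_round.numpy())  # [0. 1. 1.]
```

One caveat: outside the narrow band around 0.5, the max/min branches saturate at exactly 0 or 1, so the gradient there is zero anyway.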

Rounding is a fundamentally nondifferentiable function, so you're out of luck there. The normal procedure for this kind of situation is to find a way to use the probabilities themselves, say by using them to calculate an expected value, or by taking the maximum output probability and choosing that as the network's prediction. If you aren't using the output to calculate your loss function, though, you can go ahead and just apply rounding to the result; it doesn't matter then whether it's differentiable. Now, if you want an informative loss function for the purpose of training the network, consider whether keeping the output in the format of probabilities might actually be to your advantage (it will likely make your training process smoother); that way you can convert the probabilities to actual estimates outside of the network, after training.
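A minimal sketch of that idea (the tensors `probs` and `labels` are made-up placeholders here): compute the loss on the raw probabilities, which is differentiable, and round only at inference time:

```python
import tensorflow as tf

probs = tf.constant([0.2, 0.8, 0.6])   # hypothetical network outputs
labels = tf.constant([0.0, 1.0, 1.0])  # hypothetical binary targets

# train on the probabilities themselves (differentiable)
loss = tf.keras.losses.binary_crossentropy(labels, probs)

# round only outside the training objective, e.g. at inference
predictions = tf.round(probs)
print(predictions.numpy())  # [0. 1. 1.]
```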

This works for me:

x_rounded_NOT_differentiable = tf.round(x)
x_rounded_differentiable = (x - (tf.stop_gradient(x) - x_rounded_NOT_differentiable))
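This is the straight-through estimator trick: in the forward pass the `tf.stop_gradient` term cancels `x`, leaving exactly `tf.round(x)`, while in the backward pass the stopped term is a constant, so the gradient flows through as if the op were the identity. A quick check with `tf.GradientTape`:

```python
import tensorflow as tf

x = tf.Variable([0.3, 0.7])
with tf.GradientTape() as tape:
    rounded = x - (tf.stop_gradient(x) - tf.round(x))
    loss = tf.reduce_sum(rounded)
grads = tape.gradient(loss, x)

print(rounded.numpy())  # [0. 1.] -- forward pass equals tf.round(x)
print(grads.numpy())    # [1. 1.] -- backward pass is the identity
```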

Building on a previous answer, a way to get an arbitrarily good approximation is to approximate round() with a finite Fourier series, using as many terms as you need. Fundamentally, you can think of round(x) as adding a reverse (i.e. descending) sawtooth wave to x. So, using the Fourier expansion of the sawtooth wave, we get:

round(x) ≈ x + 1/π ∑_{n=1}^{N} (-1)^n sin(2πnx)/n

With N = 5, we get a pretty nice approximation: [figure: plot of the Fourier approximation to round()]
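A sketch of that series in TensorFlow (the function name `fourier_round` is made up here; `n_terms` is the N in the formula):

```python
import numpy as np
import tensorflow as tf

def fourier_round(x, n_terms=5):
    # round(x) ≈ x + (1/π) Σ_{n=1..N} (-1)^n sin(2πnx)/n
    result = x
    for n in range(1, n_terms + 1):
        result += ((-1) ** n) * tf.sin(2 * np.pi * n * x) / (np.pi * n)
    return result

x = tf.constant([0.3, 0.5, 0.8])
print(fourier_round(x).numpy())  # ≈ [0.04, 0.5, 1.02]
```

Note that at exactly 0.5 the series converges to the midpoint of the jump, and near the jump the approximation overshoots (Gibbs phenomenon), so more terms sharpen the step but never remove the ripple entirely.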

In the range 0 to 1, translating and scaling a sigmoid can be a solution:

  slope = 1000
  center = 0.5
  # tf.sigmoid(z) equals e/(e+1) with e = exp(z), but computing
  # exp(slope*(x-center)) directly overflows for large slopes
  round_diff = tf.sigmoid(slope * (x - center))

Kind of an old question, but I just solved this problem for TensorFlow 2.0. I am using the following round function in my audio auto-encoder project. I basically want to create a discrete representation of sound which is compressed in time. I use the round function to clamp the output of the encoder to integer values. It has been working well for me so far.

@tf.custom_gradient
def round_with_gradients(x):
    def grad(dy):
        return dy
    return tf.round(x), grad
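With `tf.custom_gradient`, the forward pass is exact rounding while the backward pass returns `dy` unchanged, i.e. the straight-through estimator again, just written explicitly. A self-contained usage check:

```python
import tensorflow as tf

@tf.custom_gradient
def round_with_gradients(x):
    def grad(dy):
        return dy  # pass the upstream gradient through unchanged
    return tf.round(x), grad

x = tf.Variable([0.2, 1.6])
with tf.GradientTape() as tape:
    y = tf.reduce_sum(round_with_gradients(x))
grads = tape.gradient(y, x)

print(grads.numpy())  # [1. 1.] -- gradients flow despite the hard round
```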

In tensorflow 2.10, there is a function called soft_round which achieves exactly this.

Fortunately, for those who are using lower versions, the source code is really simple, so I just copy-pasted those lines, and it works like a charm:

def soft_round(x, alpha, eps=1e-3):
    """Differentiable approximation to `round`.

    Larger alphas correspond to closer approximations of the round function.
    If alpha is close to zero, this function reduces to the identity.

    This is described in Sec. 4.1. in the paper
    > "Universally Quantized Neural Compression"
    > Eirikur Agustsson & Lucas Theis
    > https://arxiv.org/abs/2006.09952

    Args:
    x: `tf.Tensor`. Inputs to the rounding function.
    alpha: Float or `tf.Tensor`. Controls smoothness of the approximation.
    eps: Float. Threshold below which `soft_round` will return identity.

    Returns:
    `tf.Tensor`
    """
    # This guards the gradient of tf.where below against NaNs, while maintaining
    # correctness, as for alpha < eps the result is ignored.
    alpha_bounded = tf.maximum(alpha, eps)

    m = tf.floor(x) + .5
    r = x - m
    z = tf.tanh(alpha_bounded / 2.) * 2.
    y = m + tf.tanh(alpha_bounded * r) / z

    # For very low alphas, soft_round behaves like identity
    return tf.where(alpha < eps, x, y, name="soft_round")

alpha sets how soft the function is. Greater values lead to better approximations of the round function, but then it becomes harder to fit, since gradients vanish:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

x = tf.convert_to_tensor(np.arange(-2, 2, .1).astype(np.float32))

for alpha in [3., 7., 15.]:
    y = soft_round(x, alpha)
    plt.plot(x.numpy(), y.numpy(), label=f'alpha={alpha}')

plt.legend()
plt.title('Soft round function for different alphas')
plt.grid()

In my case, I tried different values for alpha, and 3. looks like a good choice.

[figure: soft round function for different alphas]
