
How to use tf.nn.sampled_softmax_loss with Tensorflow Keras?

I've been looking for a way to use sampled softmax, tf.nn.sampled_softmax_loss(), for one of my models, but I couldn't find any post explaining how to implement it.

If anyone has implemented it with a Keras architecture, could you please let me know how to use it with Keras?

Right now, for other losses I can just use:

model.compile(loss=tf.keras.losses.CategoricalCrossentropy())

But I can't use tf.nn.sampled_softmax_loss in the same manner, i.e. model.compile(loss=tf.nn.sampled_softmax_loss()).

I've tried model.compile(loss=tf.nn.sampled_softmax_loss()), but it returned an error, which I think is expected because the function takes the weights and biases of the last layer to calculate the loss, and I'm not sure how to implement that in Keras.

sampled_softmax_loss() computes and returns the sampled softmax training loss.

This is a faster way to train a softmax classifier over a huge number of classes.

This operation is for training only. It is generally an underestimate of the full softmax loss.

A common use case is to use this method for training, and calculate the full softmax loss for evaluation or inference. In this case, you must set partition_strategy="div" for the two losses to be consistent, as in the following example:

if mode == "train":
  loss = tf.nn.sampled_softmax_loss(
      weights=weights,
      biases=biases,
      labels=labels,
      inputs=inputs,
      ...,
      partition_strategy="div")
elif mode == "eval":
  logits = tf.matmul(inputs, tf.transpose(weights))
  logits = tf.nn.bias_add(logits, biases)
  labels_one_hot = tf.one_hot(labels, n_classes)
  loss = tf.nn.softmax_cross_entropy_with_logits(
      labels=labels_one_hot,
      logits=logits)  

Regular loss functions like CategoricalCrossentropy() use their default values; even if you don't pass any arguments, they will calculate the loss based on those defaults.

The key point for sampled_softmax_loss is to pass the right shapes of weight, bias, input, and label.
The shape of the weight passed to sampled_softmax is not the same as in the general case.
For example, if logits = xw + b, call sampled_softmax like this:

sampled_softmax(weight=tf.transpose(w), bias=b, inputs=x),
NOT sampled_softmax(weight=w, bias=b, inputs=logits)!
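To make that shape contract concrete, here is a minimal standalone sketch (not from the original answer; it assumes TensorFlow 2.x, where tf.nn.sampled_softmax_loss no longer takes a partition_strategy argument, and the batch size, dimensions, class count, and number of sampled classes are made-up illustrative values):

import tensorflow as tf

batch, dim, n_classes, n_sampled = 32, 128, 1000, 64
x = tf.random.normal([batch, dim])                    # inputs to the last layer
w = tf.Variable(tf.random.normal([dim, n_classes]))   # as in logits = x @ w + b
b = tf.Variable(tf.zeros([n_classes]))
y = tf.random.uniform([batch, 1], maxval=n_classes, dtype=tf.int64)  # class ids

loss = tf.nn.sampled_softmax_loss(
    weights=tf.transpose(w),   # [n_classes, dim], NOT [dim, n_classes]
    biases=b,                  # [n_classes]
    labels=y,                  # [batch, 1] integer ids, NOT one-hot
    inputs=x,                  # [batch, dim], NOT the logits
    num_sampled=n_sampled,
    num_classes=n_classes)     # per-example losses, shape [batch]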

Besides, label is not a one-hot representation. If your labels are one-hot encoded, pass labels=tf.reshape(tf.argmax(labels_one_hot, 1), [-1,1]).
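To tie this back to the original Keras question, one possible sketch (my own, not an official Keras pattern) is to skip model.compile(loss=...) entirely and compute the sampled loss inside a custom train_step, pulling the kernel and bias from the final Dense layer. The class name, layer sizes, and the num_sampled value below are illustrative assumptions:

import tensorflow as tf

class SampledSoftmaxModel(tf.keras.Model):
    def __init__(self, num_classes, input_dim, hidden_dim=128, num_sampled=64, **kwargs):
        super().__init__(**kwargs)
        self.num_classes = num_classes
        self.num_sampled = num_sampled
        self.hidden = tf.keras.layers.Dense(hidden_dim, activation="relu")
        self.out = tf.keras.layers.Dense(num_classes)
        # Build both layers up front so their kernel/bias variables exist in train_step.
        self.hidden.build((None, input_dim))
        self.out.build((None, hidden_dim))

    def call(self, x):
        # Full logits, used for evaluation/inference (the full softmax path).
        return self.out(self.hidden(x))

    def train_step(self, data):
        x, y = data
        y = tf.reshape(tf.cast(y, tf.int64), [-1, 1])    # integer class ids, not one-hot
        with tf.GradientTape() as tape:
            h = self.hidden(x)                            # inputs to the final layer
            loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
                weights=tf.transpose(self.out.kernel),    # [num_classes, hidden_dim]
                biases=self.out.bias,                     # [num_classes]
                labels=y,
                inputs=h,
                num_sampled=self.num_sampled,
                num_classes=self.num_classes))
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        return {"loss": loss}

# Hypothetical usage (sizes are made up); y_train holds integer class ids.
# model = SampledSoftmaxModel(num_classes=50000, input_dim=300)
# model.compile(optimizer="adam")   # no loss here; it is computed in train_step
# model.fit(x_train, y_train, batch_size=128, epochs=5)

With this setup, evaluation and inference through call() still use the full logits, which matches the train-with-sampled-softmax, evaluate-with-full-softmax pattern quoted above.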
