How to define a TensorFlow graph where all variables are float16 instead of float32
By default, TensorFlow variables are float32. To save memory, I'm trying to run in float16. In my graph, I set the data type to float16 everywhere I could. However, I get an error when I run the code.
Here's my code:
import math
import numpy as np
import tensorflow as tf

vocabulary_size = 10
batch_size = 64
embedding_size = 100
num_inputs = 4
num_sampled = 128

graph = tf.Graph()
with graph.as_default(): #took out " , tf.device('/cpu:0')"
    train_dataset = tf.placeholder(tf.int32, shape=[batch_size, num_inputs])
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])

    embeddings = tf.get_variable('embeddings', dtype=tf.float16,
        initializer=tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0, dtype=tf.float16))
    softmax_weights = tf.get_variable('softmax_weights', dtype=tf.float16,
        initializer=tf.truncated_normal([vocabulary_size, embedding_size],
            stddev=1.0 / math.sqrt(embedding_size), dtype=tf.float16))
    softmax_biases = tf.get_variable('softmax_biases', dtype=tf.float16,
        initializer=tf.zeros([vocabulary_size], dtype=tf.float16), trainable=False)

    embed = tf.nn.embedding_lookup(embeddings, train_dataset) #train data set is
    embed_reshaped = tf.reshape(embed, [batch_size*num_inputs, embedding_size])
    segments = np.arange(batch_size).repeat(num_inputs)
    averaged_embeds = tf.segment_mean(embed_reshaped, segments, name=None)

    sam_sof_los = tf.nn.sampled_softmax_loss(weights=softmax_weights, biases=softmax_biases, inputs=averaged_embeds,
        labels=train_labels, num_sampled=num_sampled, num_classes=vocabulary_size)
    loss = tf.reduce_mean(sam_sof_los)
    optimizer = tf.train.AdagradOptimizer(1.0).minimize(loss)
    saver = tf.train.Saver()
And this is the error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
509 as_ref=input_arg.is_ref,
--> 510 preferred_dtype=default_dtype)
511 except TypeError as err:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, ctx)
1143 if ret is None:
-> 1144 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
1145
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in _TensorTensorConversionFunction(t, dtype, name, as_ref)
980 "Tensor conversion requested dtype %s for Tensor with dtype %s: %r" %
--> 981 (dtype.name, t.dtype.name, str(t)))
982 return t
ValueError: Tensor conversion requested dtype float16 for Tensor with dtype float32: 'Tensor("sampled_softmax_loss/Log:0", shape=(64, 1), dtype=float32)'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-2-12d508b9e5d7> in <module>()
46
47 sam_sof_los = tf.nn.sampled_softmax_loss(weights=softmax_weights, biases=softmax_biases, inputs=averaged_embeds,
---> 48 labels=train_labels, num_sampled=num_sampled, num_classes=vocabulary_size)
49
50 loss = tf.reduce_mean( sam_sof_los )
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_impl.py in sampled_softmax_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name, seed)
1347 partition_strategy=partition_strategy,
1348 name=name,
-> 1349 seed=seed)
1350 labels = array_ops.stop_gradient(labels, name="labels_stop_gradient")
1351 sampled_losses = nn_ops.softmax_cross_entropy_with_logits_v2(
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name, seed)
1126 if subtract_log_q:
1127 # Subtract log of Q(l), prior probability that l appears in sampled.
-> 1128 true_logits -= math_ops.log(true_expected_count)
1129 sampled_logits -= math_ops.log(sampled_expected_count)
1130
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py in binary_op_wrapper(x, y)
860 with ops.name_scope(None, op_name, [x, y]) as name:
861 if isinstance(x, ops.Tensor) and isinstance(y, ops.Tensor):
--> 862 return func(x, y, name=name)
863 elif not isinstance(y, sparse_tensor.SparseTensor):
864 try:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_math_ops.py in sub(x, y, name)
8316 if _ctx is None or not _ctx._eager_context.is_eager:
8317 _, _, _op = _op_def_lib._apply_op_helper(
-> 8318 "Sub", x=x, y=y, name=name)
8319 _result = _op.outputs[:]
8320 _inputs_flat = _op.inputs
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
544 "%s type %s of argument '%s'." %
545 (prefix, dtypes.as_dtype(attrs[input_arg.type_attr]).name,
--> 546 inferred_from[input_arg.type_attr]))
547
548 types = [values.dtype]
TypeError: Input 'y' of 'Sub' Op has type float32 that does not match type float16 of argument 'x'.
The error comes from the tf.nn.sampled_softmax_loss line.
At first I thought tf.segment_mean might cast its output to float32, so I tried casting averaged_embeds to float16, but I still get the same error.
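For reference, the tf.segment_mean documentation states that the output has the same type as its data argument, which is consistent with the cast making no difference. What the averaging step computes can be sketched in plain Python. This is only an illustration of the grouping, not TensorFlow's implementation; the helper names segment_ids and segment_mean below are made up for the example, and the sizes are shrunk so the result is easy to read.

```python
def segment_ids(batch_size, num_inputs):
    """Mirror np.arange(batch_size).repeat(num_inputs): [0, 0, ..., 1, 1, ...]."""
    return [row for row in range(batch_size) for _ in range(num_inputs)]


def segment_mean(rows, ids):
    """Average the rows that share a segment id (ids sorted ascending)."""
    sums, counts = {}, {}
    for row, seg in zip(rows, ids):
        if seg not in sums:
            sums[seg] = [0.0] * len(row)
            counts[seg] = 0
        sums[seg] = [acc + value for acc, value in zip(sums[seg], row)]
        counts[seg] += 1
    return [[total / counts[seg] for total in sums[seg]] for seg in sorted(sums)]


# Toy sizes (assumed for illustration): batch_size=2, num_inputs=2, embedding_size=3.
rows = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
ids = segment_ids(batch_size=2, num_inputs=2)
averaged = segment_mean(rows, ids)  # one averaged row per example in the batch
```

Each run of num_inputs consecutive rows collapses into one row, so the result has batch_size rows of width embedding_size.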
From the documentation, there doesn't seem to be a way to specify any data types in sampled_softmax_loss:
https://www.tensorflow.org/api_docs/python/tf/nn/sampled_softmax_loss
As far as I can tell, you can only do it using a hack.
The issue comes from this call inside tf.nn.sampled_softmax_loss:
if sampled_values is None:
    sampled_values = candidate_sampling_ops.log_uniform_candidate_sampler(
        true_classes=labels,
        num_true=num_true,
        num_sampled=num_sampled,
        unique=True,
        range_max=num_classes,
        seed=seed)
which outputs an object of this type, whose expected counts are float32:
LogUniformCandidateSampler(
    sampled_candidates=<tf.Tensor 'LogUniformCandidateSampler:0' shape=(128,) dtype=int64>,
    true_expected_count=<tf.Tensor 'LogUniformCandidateSampler:1' shape=(64, 1) dtype=float32>,
    sampled_expected_count=<tf.Tensor 'LogUniformCandidateSampler:2' shape=(128,) dtype=float32>
)
The hack is to generate the LogUniformCandidateSampler result yourself, cast its floating-point fields to tf.float16, and pass it to tf.nn.sampled_softmax_loss through the sampled_values argument.
from collections import namedtuple

# Redefine the result type, since TensorFlow's own is not exposed.
LogUniformCandidateSampler = namedtuple(
    "LogUniformCandidateSampler",
    ["sampled_candidates", "true_expected_count", "sampled_expected_count"])

sampled_values = tf.nn.log_uniform_candidate_sampler(
    true_classes=tf.cast(train_labels, tf.int64),
    num_sampled=num_sampled,
    num_true=1,
    unique=True,
    range_max=vocabulary_size,
    seed=None)

# Cast only the floating-point fields; the sampled class ids stay int64.
sampled_value_16 = LogUniformCandidateSampler(
    sampled_values.sampled_candidates,
    tf.cast(sampled_values.true_expected_count, tf.float16),
    tf.cast(sampled_values.sampled_expected_count, tf.float16))

sam_sof_los = tf.nn.sampled_softmax_loss(
    weights=softmax_weights,
    biases=softmax_biases,
    inputs=averaged_embeds,
    labels=train_labels,
    num_sampled=num_sampled,
    num_classes=vocabulary_size,
    sampled_values=sampled_value_16)
But this is really a hack, and it might have unexpected consequences (an expected one would be that the tf.cast operation is not differentiable).
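One consequence that can be checked without TensorFlow is the precision lost when an expected count is stored in half precision before its logarithm is taken. The sketch below uses the standard library's struct module, whose "e" format is IEEE 754 half precision, to round values to float16. The helper names and the sample counts are assumptions chosen for illustration; they are not values produced by the sampler.

```python
import struct


def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def relative_error(value):
    """Relative error introduced by storing `value` in half precision."""
    return abs(to_float16(value) - value) / abs(value)


# Hypothetical expected counts, chosen only for illustration.
expected_counts = [0.5, 0.1, 0.003, 1e-4]
errors = [relative_error(count) for count in expected_counts]

# A count far below the float16 range rounds to exactly zero.
underflowed = to_float16(1e-8)

for count, error in zip(expected_counts, errors):
    print(count, to_float16(count), error)
print("1e-8 as float16:", underflowed)
```

Counts in the normal float16 range keep roughly three significant decimal digits, while a sufficiently small count becomes zero, and its logarithm is then no longer finite. Whether the sampler ever produces counts that small depends on the vocabulary size and num_sampled, so treat this as something to check rather than a certain failure.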