Currently I train a Keras-on-TensorFlow model with the default setting, float32.
After training, the network is quantized: the weights are cast to float16. This improves performance by ~3x while keeping the same accuracy.
I tried training in float16 from the start and failed miserably. I cannot find any link that explains whether that is possible and, if not, why it is not possible.
Automatic Mixed Precision from NVIDIA might be the way to go.
From what I've gathered, since TensorFlow 1.14 it is (was) supported upstream. All you would have to do is wrap your optimizer like this:

    opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)
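To make that concrete, here is a minimal sketch of the rewrite in a plain TF 1.x graph-mode setup (the placeholder shapes and layer sizes are made up for illustration):

    import tensorflow as tf

    # Toy graph: two dense layers on made-up input shapes.
    x = tf.placeholder(tf.float32, shape=(None, 32))
    y = tf.placeholder(tf.int64, shape=(None,))
    hidden = tf.layers.dense(x, 128, activation=tf.nn.relu)
    logits = tf.layers.dense(hidden, 10)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=logits)

    opt = tf.train.AdamOptimizer(learning_rate=1e-3)
    # The rewrite inserts float16 casts where they are safe and wraps
    # the optimizer with dynamic loss scaling.
    opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)
    train_op = opt.minimize(loss)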
You might also need to set a specific environment variable from within your Python script, namely:

    os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '1'
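A sketch of that; I am assuming the variable should be set before TensorFlow builds any graphs, so setting it before the import is the safe option:

    import os

    # Enable the auto mixed precision graph rewrite via the environment
    # instead of (or in addition to) wrapping the optimizer.
    os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '1'

    import tensorflow as tf  # import after the flag is set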
The above should already employ good mixed-precision training practices (e.g. loss scaling, keeping float32 where necessary, etc.).
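To illustrate what loss scaling means, here is a conceptual sketch of static loss scaling, which the rewrite automates with a dynamic scale (the constant 128.0 is an arbitrary illustration):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=(None, 4))
    w = tf.Variable(tf.ones((4, 1)))
    loss = tf.reduce_mean(tf.square(tf.matmul(x, w)))

    scale = 128.0  # arbitrary illustrative constant
    # Scale the loss up so small gradients don't underflow in float16,
    # then unscale the gradients before the weight update.
    grads = tf.gradients(loss * scale, [w])
    grads = [g / scale for g in grads]
    train_op = tf.train.GradientDescentOptimizer(0.1).apply_gradients(zip(grads, [w]))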
A good resource for this solution is NVIDIA's official documentation.
Some other resources I gathered which might also be useful (though they do not seem to indicate you would have to do anything more): here, here, or here.
I would advise against manual casting, as you might easily lose precision (e.g. in BatchNorm statistics used during inference) unless you know the ins and outs of the specific layers.
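As a hypothetical illustration of how this can bite: BatchNorm's moving statistics can accumulate values beyond float16's maximum of ~65504, so a blind cast silently overflows them (the 70000.0 value below is made up but plausible):

    import numpy as np

    moving_variance = np.float32(70000.0)  # plausible accumulated statistic
    print(np.float16(moving_variance))     # inf -- float16 tops out at ~65504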
Additionally, you might also check the bfloat16 (brain float) type from Google, which has the same exponent width as float32 (8 bits) and a smaller fraction. This lets it keep a greater range of values (e.g. when computing gradients) compared to float16, which allows one to avoid loss scaling.
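A quick sketch of the range difference, in TF 1.x session style to match the above (the constant 1e30 is just an example value):

    import tensorflow as tf

    v = tf.constant(1e30, dtype=tf.float32)
    with tf.Session() as sess:
        print(sess.run(tf.cast(v, tf.float16)))   # inf -- float16 max is ~6.55e4
        print(sess.run(tf.cast(v, tf.bfloat16)))  # ~1e30 -- bfloat16 shares float32's exponent range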
The above (bfloat16) should be useful mainly on TPUs; AFAIK NVIDIA GPUs' support for it is not too great (someone correct me if I'm wrong). Some information here.