
Massive neural network training time increase by inverting images in a data set

I have been working with neural networks for a few months now and I have a little mystery that I can't solve on my own.

I wanted to create and train a neural network which can identify simple geometric shapes (squares, circles, and triangles) in 56*56 pixel greyscale images. If I use images with a black background and a white shape, everything works pretty well. Training takes about 18 epochs and the accuracy is close to 100% (usually 99.6%-99.8%).

But all that changes when I invert the images (i.e., now a white background and black shapes). The training time skyrockets to somewhere around 600 epochs, and during the first 500-550 epochs nothing really happens: the loss barely decreases, and it seems like something is "stuck".

Why does the training time increase so much and how can I reduce it (if possible)?

Color inversion

You essentially have to "flip" all W×H pixels of every image, touching every pixel during augmentation, which amounts to a lot of computation.

In total that is D×W×H operations per epoch (where D is the size of your dataset).

You might want to precompute the inverted images once and feed your neural network with those afterwards.
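As a sketch of that idea (using NumPy and a hypothetical random dataset array as a stand-in for your real images), the inversion can be done once up front instead of being repeated every epoch:

```python
import numpy as np

# Hypothetical dataset: D greyscale images of 56x56 pixels, values in [0, 1].
rng = np.random.default_rng(0)
images = rng.random((1000, 56, 56))

# Inverting on the fly costs D x W x H operations every epoch.
# Precompute the inverted copies a single time instead:
inverted = 1.0 - images  # vectorized over the whole dataset

# Train on `inverted`; no per-epoch augmentation cost remains.
```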

Loss

It is harder for the neural network because white is encoded as 1 and black as 0. After inversion, the shape pixels (now black) are 0 and the background pixels (now white) are 1.

This means most of the neural network's weights are activated by background pixels!

What is more, every meaningful signal (the shape pixels, which are 0 after inversion) is multiplied by a zero value and has no effect on the final loss.

With hard {0, 1} encoding, the neural network essentially tries to extract signal from the background (now 1 for the formerly black pixels), which is mostly meaningless (each weight will tend towards zero or near-zero, as the background carries little to no information). What it does instead is fit the distribution of your labels: always predicting the same label yields a smaller loss, no matter the input.
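A toy illustration of this effect (an assumed setup: a single linear unit over a flattened 56×56 image, with made-up uniform weights):

```python
import numpy as np

# White 16x16 square on a black background (original encoding).
img = np.zeros((56, 56))
img[20:36, 20:36] = 1.0
inv = 1.0 - img  # inverted: black square on a white background

w = np.full(56 * 56, 0.01)  # arbitrary uniform weights

# Original: only the 256 shape pixels contribute to the pre-activation.
contrib_orig = float(w @ img.ravel())
# Inverted: the 2880 background pixels drive the activation, while the
# shape pixels (now 0) are multiplied by zero and contribute nothing.
contrib_inv = float(w @ inv.ravel())
```

The shape's contribution dominates in the original encoding but vanishes entirely after inversion, which matches the "stuck" loss you observe.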

Experiment if you are bored

Try encoding your pixels with smooth values, e.g. white as 0.1 and black as 0.9 (1.0 might also work okay, though more epochs might be needed, as an exact 0 is very hard to obtain via backprop), and see what the results are.
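One way to sketch that encoding (the mapping and value range come from the suggestion above; the function name is made up):

```python
import numpy as np

def smooth_encode(img, lo=0.1, hi=0.9):
    """Map greyscale values in [0, 1] so that white (1) -> lo and
    black (0) -> hi; no pixel is exactly zero, so every pixel keeps
    a nonzero contribution during backprop."""
    return hi - (hi - lo) * img

# Example: pure black and pure white pixels.
pixels = np.array([0.0, 1.0])
encoded = smooth_encode(pixels)  # black -> 0.9, white -> 0.1
```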
