
Bipolar Data Representation in Neural Networks

Why do we use a bipolar data representation in neural networks? For example, -0.5 and 0.5 in place of 0 and 1, or -1 and 1 in place of 0 and 1, as in this article: http://www.codeproject.com/Articles/11285/Neural-Network-OCR?fid=206868&df=90&mpp=25&noise=3&prof=True&sort=Position&view=Normal&spc=Relaxed&fr=26#xx0xx

Your question is motivated, I'm guessing, by this statement from your ref:

But, in many neural network training tasks, it's preferred to represent training patterns in so called "bipolar" way , placing into input vector "0.5" instead of "1" and "-0.5" instead of "0".

There are two considerations that go into the use of the 'bipolar' scaling:

a) The general choice of bipolar range is usually determined by the transfer functions used by the neural network, in cases where the distribution of the input is Gaussian or similar, i.e. most of the values are centred around some mean with only a relatively small number of outliers. For example, if you use a logistic function for your nodes (output in [0, +1]), then you would scale your inputs to [0, +1]. Similarly, if you use a tanh function (output in [-1, +1]), then you would scale your inputs to [-1, +1]. All of this assumes that your inputs are continuous.
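As a rough illustration of point a), here is a minimal Python sketch of both cases: mapping a binary 0/1 pattern onto two bipolar values, and linearly rescaling continuous inputs to a target range. The function names `to_bipolar` and `rescale` and their default ranges are assumptions made for this example; they are not taken from the referenced article.

```python
def to_bipolar(pattern, low=-0.5, high=0.5):
    """Map a binary pattern (0/1 values) onto the two values low/high.

    The defaults mirror the -0.5/+0.5 encoding quoted from the article.
    """
    if any(v not in (0, 1) for v in pattern):
        raise ValueError("expected a pattern containing only 0 and 1")
    return [high if v == 1 else low for v in pattern]


def rescale(values, lo, hi, new_lo=-1.0, new_hi=1.0):
    """Linearly map continuous values from [lo, hi] onto [new_lo, new_hi].

    lo and hi are the known (or assumed) bounds of the raw input.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = (new_hi - new_lo) / (hi - lo)
    return [new_lo + (v - lo) * scale for v in values]


print(to_bipolar([0, 1, 1, 0]))
print(rescale([0.0, 5.0, 10.0], lo=0.0, hi=10.0))
```

For a logistic network you would call `rescale(..., new_lo=0.0, new_hi=1.0)` instead; the only thing that changes is the target range.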

b) The range is further refined because of how learning takes place. NN learning usually uses the derivative of the transfer function, and learning is fastest where that derivative is largest, i.e. on the steepest part of the transfer function. At either extreme of the transfer function the curve flattens out and the derivative is small, so learning is minimal or slow. To avoid those regions, if you are certain of the value range of your inputs, you scale them so that they lie well within the steep part of the transfer function: typically, say, [-0.8, +0.8] for tanh(), but in your reference [-0.5, +0.5] for 'BipolarSigmoidFunction'.
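To make point b) concrete, the sketch below evaluates the derivative of tanh at a few points, using the identity tanh'(x) = 1 - tanh(x)^2. It treats x as the value fed directly into the transfer function and ignores weights and biases, which is a simplification; the helper name `tanh_derivative` is just for illustration.

```python
import math


def tanh_derivative(x):
    """Derivative of tanh at x, via the identity 1 - tanh(x)**2."""
    t = math.tanh(x)
    return 1.0 - t * t


# Compare the slope near the centre with the slope out in the flat tails.
for x in (0.0, 0.5, 0.8, 2.0, 4.0):
    print(f"x = {x:3.1f}   tanh'(x) = {tanh_derivative(x):.4f}")
```

The slope is at its maximum of 1 at x = 0, is still above 0.5 at x = 0.8, and has all but vanished by x = 4, which is why inputs are kept away from the extremes.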

TL;DR: the choice of bipolar range is determined by the transfer function (your ref uses 'BipolarSigmoidFunction'), and the bipolar values are arbitrary but centred on the steepest part of the transfer function curve.
