Error when training a neural network on an array with rows of different widths

I get an error when passing the INPUT array to a neural network for training. I need to train the network on an array in which some rows have a different number of columns. I set the number of input neurons to the length of the longest row. I am using the Encog library. Is it possible to do this? Please help, I am a beginner at this.

I tried reducing the number of input neurons, but then some of the data in the array was not used. I also searched for information on this, without success.

// Training data. Note that the rows have 13, 13, 11, 13 and 12 values.
public static double[][] INPUT = {
    {1.0, 8.0, 13.0, 0.0, 12.0, 6.0, 17.0, 24.0, 440.0, 6.0, 0.0, 19.0, 96.0},
    {1.0, 0.0, 0.0, 4.0, 52.0, 6.0, 0.0, 5.0, 6.0, 7.0, 150.0, 5.0, 1.0},
    {0.0, 0.0, 0.0, 0.0, 0.0, 413.0, 0.0, 117.0, 0.0, 0.0, 0.0},
    {1.0, 1.0, 1.0, 7.0, 0.0, 3.0, 7.0, 167.0, 1.0, 7.0, 0.0, 1.0, 44.0},
    {0.0, 1.0, 5.0, 5.0, 5.0, 6.0, 0.0, 4.0, 186.0, 13.0, 0.0, 1.0}
};

// Ideal (target) outputs, one pair per input row.
public static double[][] IDEAL = {
    {0.9, 0.1}, {0.3, 0.7}, {0.2, 0.8}, {1.0, 0.0}, {0.4, 0.6}
};

Here is the neural network structure:

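import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.ml.train.MLTrain;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

// 13 inputs, one hidden layer of 9 neurons, 2 outputs
BasicNetwork network = new BasicNetwork();
network.addLayer(new BasicLayer(null, true, 13));
network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 9));
network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 2));
network.getStructure().finalizeStructure();
network.reset();

MLDataSet trainSet = new BasicMLDataSet(INPUT, IDEAL);
MLTrain train = new ResilientPropagation(network, trainSet);

// Train until the error drops to 0.01 or below
int epoch = 1;
do {
    train.iteration();
    System.out.println("Epoch #" + epoch + " Error:" + train.getError());
    epoch++;
} while (train.getError() > 0.01);
train.finishTraining();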

Above you have the input neuron count set to 13, so Encog will require that you always submit 13 inputs.
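
A quick way to see which rows would be rejected is to compare each row's length with the input neuron count before building the data set. The following is only a minimal sketch in plain Java with no Encog calls; the class name RowWidthCheck and the method mismatchedRows are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class RowWidthCheck {
    /** Returns the indices of rows whose length differs from inputCount. */
    static List<Integer> mismatchedRows(double[][] rows, int inputCount) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            if (rows[i].length != inputCount) {
                bad.add(i);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Small made-up data set: the middle row is one value short.
        double[][] input = {
            {1.0, 2.0, 3.0},
            {4.0, 5.0},
            {6.0, 7.0, 8.0}
        };
        System.out.println(mismatchedRows(input, 3)); // prints [1]
    }
}
```

Running the same check on the INPUT array from the question with an input count of 13 would flag every row that is shorter than 13 values.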

It's hard to answer without knowing what the data represent. What do the columns represent? Why do some rows have fewer values than others? Are the values missing? If they are, then you should always supply an input vector of 13 values and find a sensible way to approximate the missing ones.
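
One common way to approximate missing values is to replace each one with the mean of the known values in the same column. The sketch below assumes something the question does not state: that you know which column each missing value belongs to, so every row can be written at full width with Double.NaN marking the gaps. The class MeanImputer and its method are hypothetical names, and mean imputation is only one option, not necessarily the right one for this data.

```java
import java.util.Arrays;

final class MeanImputer {
    /**
     * Returns a copy of rows in which every Double.NaN is replaced by the mean
     * of the known values in the same column (0.0 if the column has none).
     * Assumes all rows already have the same length.
     */
    static double[][] fillWithColumnMean(double[][] rows) {
        int cols = rows[0].length;
        double[][] out = new double[rows.length][cols];
        for (int c = 0; c < cols; c++) {
            double sum = 0.0;
            int known = 0;
            for (double[] row : rows) {
                if (!Double.isNaN(row[c])) {
                    sum += row[c];
                    known++;
                }
            }
            double mean = known == 0 ? 0.0 : sum / known;
            for (int r = 0; r < rows.length; r++) {
                out[r][c] = Double.isNaN(rows[r][c]) ? mean : rows[r][c];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double nan = Double.NaN; // marker for a missing value (assumption)
        double[][] rows = {
            {1.0, 4.0, 10.0},
            {3.0, nan, 20.0},
            {nan, 8.0, 30.0}
        };
        // prints [[1.0, 4.0, 10.0], [3.0, 6.0, 20.0], [2.0, 8.0, 30.0]]
        System.out.println(Arrays.deepToString(fillWithColumnMean(rows)));
    }
}
```

The filled array has a fixed width, so it could then be passed to the network in place of the ragged one.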

Simply omitting missing values is problematic: if you remove them and shift everything to the left, values can end up at the wrong input neuron.
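
To make the misalignment concrete, here is a small sketch with a made-up three-column record. The column meanings (age, height, weight) and the class ShiftLeftDemo are purely hypothetical and are not taken from the question's data.

```java
import java.util.Arrays;

final class ShiftLeftDemo {
    /** Drops NaN entries and packs the remaining values to the left. */
    static double[] compact(double[] row) {
        return Arrays.stream(row).filter(v -> !Double.isNaN(v)).toArray();
    }

    public static void main(String[] args) {
        // Hypothetical record {age, height, weight} with height missing.
        double[] aligned = {30.0, Double.NaN, 70.0};
        double[] shifted = compact(aligned);
        // In the aligned row, weight sits at index 2. After compaction it
        // sits at index 1, where the network expects height.
        System.out.println(Arrays.toString(shifted)); // prints [30.0, 70.0]
    }
}
```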

If the arrays have different lengths because they are some sort of time series data with varying sequence length, then you need to use either a different type of encoding or a time series model (something other than a feedforward neural network).
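
As an illustration of what a different encoding could look like, a sliding window turns a sequence of any length into rows of one fixed width, which a feedforward network can accept. This is only a sketch of one possible encoding, not something the answer prescribes; the class SlidingWindow and the window size are assumptions, and how the ideal outputs would be assigned to each window depends on what the data mean.

```java
import java.util.Arrays;

final class SlidingWindow {
    /**
     * Splits a series into overlapping windows of a fixed size, one step apart.
     * Returns an empty array if the series is shorter than the window.
     */
    static double[][] windows(double[] series, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        int count = Math.max(0, series.length - size + 1);
        double[][] out = new double[count][];
        for (int i = 0; i < count; i++) {
            out[i] = Arrays.copyOfRange(series, i, i + size);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] series = {1.0, 2.0, 3.0, 4.0, 5.0};
        // prints [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
        System.out.println(Arrays.deepToString(windows(series, 3)));
    }
}
```

With this encoding the input layer size equals the window size, regardless of how long each original sequence was.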

