
Neural network architecture to learn a function

I'm designing a neural network to learn a function of the form:

P = (x^2 * y)/z

I'm using the keras library to build the neural net with:

import keras
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.utils.generic_utils import get_custom_objects

def custom_activation(x):
    # square the input, intended to model the x^2 term
    return keras.backend.square(x)

def custom_activation2(x):
    # reciprocal of the input, intended to model the 1/z term
    return keras.backend.pow(x, -1)

get_custom_objects().update({'custom_activation': Activation(custom_activation)})
get_custom_objects().update({'custom_activation2': Activation(custom_activation2)})

model = Sequential()
model.add(Dense(512, input_dim=3, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(1, activation=custom_activation))

[image: plot of the learned function (blue line)]

Using this architecture, I'm able to learn a function as shown by the blue line in the image. It seems that with custom_activation I'm able to capture the x^2 part of the equation.

How do I change my architecture to incorporate the y and z parts of the equation?

I do have custom_activation2, but I'm not sure how to add it to my architecture.

I think there is a misconception here. If you already know that your function has the form x^2*y/z, then you do not need a neural network to learn it. In particular, you do not have to invent new activation functions that resemble your ground-truth function. A neural network is often considered a "universal approximator", see Wiki. Loosely speaking, that means a sufficiently large feed-forward network can approximate any continuous function.

So, if you already know that your function looks like f(x,y,z) = x^2*y/z and you are only looking for some parameters to fit it to your data, then have a look at nonlinear regression, for example with scipy:
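As a minimal sketch (assuming the parametric form a*x^2*y/z + b and synthetic data; this snippet is an illustration, not part of the original post), nonlinear regression with scipy.optimize.curve_fit could look like this:

import numpy as np
from scipy.optimize import curve_fit


# Hypothetical parametric model: P = a * x^2 * y / z + b
def f(X, a, b):
    x, y, z = X
    return a * x**2 * y / z + b


# Synthetic data in the same range used by the network example below
rng = np.random.default_rng(0)
x, y, z = rng.uniform(1, 11, (3, 1000))
P = x**2 * y / z

popt, pcov = curve_fit(f, (x, y, z), P, p0=(0.5, 0.0))
print(popt)  # should recover a ~ 1, b ~ 0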

If you insist on using a neural network, then leave it to the network to find the right parameters: use standard activation functions and topologies. This is often a good starting point, and refinements can be made later.

To see that this works, look at the following example code, where a simple feed-forward network is trained. Note that I did not spend much time on the architecture, so there is plenty of room for improvement.

from keras.models import Sequential
from keras.layers import Dense
from keras import losses
from keras import regularizers
import numpy as np
import matplotlib.pyplot as plt


def gen_data(batch_size=100):
    # Endless generator: x1, x2, x3 uniform in [1, 11); target y = x1^2 * x2 / x3.
    while True:
        X = np.zeros((batch_size, 3))
        x1 = np.random.random(batch_size) * 10 + 1
        x2 = np.random.random(batch_size) * 10 + 1
        x3 = np.random.random(batch_size) * 10 + 1
        y = x1 * x1 * x2 / x3
        X[:, 0] = x1
        X[:, 1] = x2
        X[:, 2] = x3
        yield X, y


def plot_loss(history):
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()


def fit_model():
    model = Sequential()
    model.add(Dense(1024, input_dim=3, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(512, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(128, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(128, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(1, activation=None))
    model.compile(optimizer="Adam", loss=losses.mean_squared_error)
    history = model.fit_generator(gen_data(), steps_per_epoch=200, epochs=15, validation_data=gen_data(),
                                  validation_steps=200)
    return history


history = fit_model()
plot_loss(history)

This results in a loss curve that looks like this: [image: training and validation loss curves]
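Beyond the loss curve, a quick sanity check (a hypothetical addition, not from the original answer) is to compare predictions with the ground truth on a fresh batch; the History object returned by fit_generator keeps a reference to the trained model:

X_test, y_test = next(gen_data(batch_size=5))
y_pred = history.model.predict(X_test).ravel()
print(np.column_stack([y_test, y_pred]))  # ground truth next to prediction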
