
Best practice for controlling the number and size of layers in a PyTorch neural network?

I'm looking for general best practices for controlling/adjusting the number of layers and their sizes in PyTorch neural networks.

I have a configuration file in which I specify values for particular experiment variables. Additionally, I'd like this file to have an option that determines the number and size of the network's layers.

Current solution:

config.py

ACTOR_LAYER_SIZES = (128, 256, 128)

network.py

from itertools import chain

import torch.nn as nn

# input_size: int
# output_size: int
# layer_sizes = ACTOR_LAYER_SIZES

layers = [
    nn.Linear(input_size, layer_sizes[0]),
    nn.ReLU(),
    nn.BatchNorm1d(layer_sizes[0]),
]
layers += list(
    chain.from_iterable(
        [
            nn.Linear(n_size, next_n_size),
            nn.ReLU(),
            nn.BatchNorm1d(next_n_size),
        ]
        for n_size, next_n_size in zip(layer_sizes, layer_sizes[1:])
    )
)
layers += [nn.Linear(layer_sizes[-1], output_size)]

network = nn.Sequential(*layers)

I wonder whether using chain.from_iterable can be considered best practice here. Furthermore, this code seems a little lengthy. Is there a better way to do this?

I use the following snippet for this task:

import torch.nn as nn

num_inputs = 10
num_outputs = 5
hidden_layers = (128, 256, 128)
activation = nn.ReLU

layers = (
    num_inputs,
    *hidden_layers,
    num_outputs
)

network_architecture = []
for i in range(1, len(layers)):
    network_architecture.append(nn.Linear(layers[i - 1], layers[i]))
    if i < len(layers) - 1:  # skip the activation after the output layer (e.g. for regression)
        network_architecture.append(activation())

model = nn.Sequential(*network_architecture)

The if statement prevents the output layer from getting an activation function. This is necessary when you do regression. If you want to do classification, however, you typically need an activation function there (e.g. softmax) to turn the raw outputs into class probabilities.
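To make that choice explicit, here is a minimal sketch; the build_mlp helper and its output_activation parameter are my own hypothetical additions, not part of the snippet above:

import torch.nn as nn

def build_mlp(num_inputs, num_outputs, hidden_layers,
              activation=nn.ReLU, output_activation=None):
    # output_activation: an already-instantiated module, e.g.
    # nn.Softmax(dim=1) for classification, or None for regression.
    layers = (num_inputs, *hidden_layers, num_outputs)
    modules = []
    for i in range(1, len(layers)):
        modules.append(nn.Linear(layers[i - 1], layers[i]))
        if i < len(layers) - 1:  # hidden layers only
            modules.append(activation())
    if output_activation is not None:
        modules.append(output_activation)
    return nn.Sequential(*modules)

regressor = build_mlp(10, 1, (128, 256, 128))
classifier = build_mlp(10, 5, (128, 256, 128),
                       output_activation=nn.Softmax(dim=1))

One caveat: PyTorch's nn.CrossEntropyLoss expects raw logits and applies log-softmax internally, so when training a classifier with that loss you would usually leave output_activation as None.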

Using a for loop with an if statement instead of chain.from_iterable has the advantage of being universally and intuitively understood. Furthermore, because the activation function is defined outside the loop, it is configurable.

Adding the BatchNorm1d layer should be straightforward.
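As a minimal sketch of that (my own illustration, reusing the variable names from the snippet above and matching the Linear -> ReLU -> BatchNorm1d ordering from the question):

import torch.nn as nn

num_inputs, num_outputs = 10, 5
hidden_layers = (128, 256, 128)
activation = nn.ReLU
layers = (num_inputs, *hidden_layers, num_outputs)

network_architecture = []
for i in range(1, len(layers)):
    network_architecture.append(nn.Linear(layers[i - 1], layers[i]))
    if i < len(layers) - 1:
        network_architecture.append(activation())
        # normalize the hidden activations, as in the question's code
        network_architecture.append(nn.BatchNorm1d(layers[i]))

model = nn.Sequential(*network_architecture)

Note that BatchNorm1d requires batches of more than one sample in training mode, and whether to place it before or after the activation is itself a design choice.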
