
How to pass input dim from fit method to skorch wrapper?

I am trying to incorporate PyTorch functionality into a scikit-learn environment (in particular Pipelines and GridSearchCV) and have therefore been looking into skorch. The standard documentation example for neural networks looks like this:

import torch.nn.functional as F
from torch import nn
from skorch import NeuralNetClassifier

class MyModule(nn.Module):
    def __init__(self, num_units=10, nonlin=F.relu):
        super(MyModule, self).__init__()

        self.dense0 = nn.Linear(20, num_units)
        self.nonlin = nonlin
        self.dropout = nn.Dropout(0.5)
        ...
        ...
        self.output = nn.Linear(10, 2)
    ...
    ...

where you explicitly pass the input and output dimensions by hardcoding them into the constructor. However, this is not really how scikit-learn interfaces work: there, the input and output dimensions are inferred by the fit method rather than passed explicitly to the constructor. As a practical example, consider

from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

# copied from the documentation
net = NeuralNetClassifier(
    MyModule,
    max_epochs=10,
    lr=0.1,
    # Shuffle training data on each epoch
    iterator_train__shuffle=True,
)

# any general Pipeline interface
pipeline = Pipeline([
    ('transformation', AnyTransformer()),
    ('net', net),
])

gs = GridSearchCV(pipeline, params, refit=False, cv=3, scoring='accuracy')
gs.fit(X, y)

Besides the fact that none of the transformers require you to specify input or output dimensions, the transformers applied before the model may well change the dimensionality of the training set (think of dimensionality reduction and the like), so hardcoding the input and output dimensions in the neural network constructor just will not do.

Did I misunderstand how this is supposed to work, or otherwise what would be a suggested solution? (I was thinking of moving the layer construction into the forward method, where the X from fit is available, but I am not sure this is good practice.)
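To make that last idea concrete, here is a minimal sketch of what deferring layer construction to forward could look like; LazyModule and its attributes are hypothetical names, not from skorch or the code above, and the comment notes the main pitfall:

import torch

class LazyModule(torch.nn.Module):
    # Sketch: create the first layer inside forward, where the input
    # dimensionality is finally known.
    def __init__(self, num_units=10):
        super().__init__()
        self.num_units = num_units
        self.dense0 = None  # built lazily on the first forward pass
        self.output = torch.nn.Linear(num_units, 2)

    def forward(self, X):
        if self.dense0 is None:
            # Pitfall: a parameter created here, after the optimizer was
            # set up, is not registered with that optimizer.
            self.dense0 = torch.nn.Linear(X.shape[-1], self.num_units)
        y = torch.relu(self.dense0(X))
        return torch.softmax(self.output(y), dim=-1)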

This is a very good question and I'm afraid there is no best-practice answer to this, as PyTorch is normally written in a way where initialization and execution are separate steps, which is exactly what you don't want in this case.

There are several ways forward, which all go in the same direction: introspect the input data and re-initialize the network before fitting. The simplest way I can think of is writing a callback that sets the corresponding parameter when training begins:

class InputShapeSetter(skorch.callbacks.Callback):
    def on_train_begin(self, net, X, y):
        # Derive the feature count from the training data and update
        # the module parameter, which re-initializes the module.
        net.set_params(module__input_dim=X.shape[-1])

This sets a module parameter when training begins, which re-initializes the PyTorch module with said parameter. This specific callback expects that the parameter for the first layer is called input_dim, but you can change this if you want.
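For instance, if your module calls the parameter something else, a small variant of the callback can take the name as an argument; this is just a sketch, and the param_name argument is my own addition, not part of skorch:

import skorch

class NamedInputShapeSetter(skorch.callbacks.Callback):
    # Hypothetical variant: the module parameter to set is configurable,
    # e.g. 'module__n_features' instead of 'module__input_dim'.
    def __init__(self, param_name='module__input_dim'):
        self.param_name = param_name

    def on_train_begin(self, net, X=None, y=None, **kwargs):
        net.set_params(**{self.param_name: X.shape[-1]})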

A full example:

import torch
import skorch
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA

X, y = make_classification()
X = X.astype('float32')  # PyTorch linear layers expect float32 inputs

class ClassifierModule(torch.nn.Module):
    def __init__(self, input_dim=80):  # arbitrary default; the callback overrides it
        super().__init__()
        self.l0 = torch.nn.Linear(input_dim, 10)
        self.l1 = torch.nn.Linear(10, 2)

    def forward(self, X):
        y = self.l0(X)
        y = self.l1(y)
        return torch.softmax(y, dim=-1)


class InputShapeSetter(skorch.callbacks.Callback):
    def on_train_begin(self, net, X, y):
        net.set_params(module__input_dim=X.shape[-1])


net = skorch.NeuralNetClassifier(
    ClassifierModule,
    callbacks=[InputShapeSetter()],
)

pipe = Pipeline([
    ('pca', PCA(n_components=10)),
    ('net', net),
])

pipe.fit(X, y)
print(pipe.predict(X))
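And since the question specifically asks about GridSearchCV: because the callback re-derives input_dim on every fit, each candidate dimensionality works without manual changes. A quick sketch under the same setup as above (the parameter grid is purely illustrative):

from sklearn.model_selection import GridSearchCV

# Illustrative grid: each PCA setting changes the input dimensionality,
# and InputShapeSetter adapts the module on every fit.
params = {
    'pca__n_components': [5, 10],
    'net__lr': [0.1, 0.01],
}

gs = GridSearchCV(pipe, params, cv=3, scoring='accuracy')
gs.fit(X, y)
print(gs.best_params_)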
