How to use an input layer that also feeds on a previous layer of a neural network?

Let's say I want to predict the winner of a tag-team race, where some drivers usually place higher in certain weather conditions:

Race   |Driver | Weather | Time
Dummy1 |D1     | Rain    | 2:00
Dummy1 |D2     | Rain    | 5:00
Dummy1 |D3     | Rain    | 4:50
Dummy2 |D1     | Sunny   | 3:00
Dummy2 |D2     | Sunny   | 2:50
Dummy2 |D3     | Sunny   | 2:30
...

The logic is that a team composed of D1 and D3 would outperform any other combination in rain, but wouldn't have the same luck in other weather. With that said, I thought about the following model:

Layer 1          |   Layer 2             | Layer 3 (output)
Driver encoding  | weather encoding      | expected race time
----------------------------------------------------------------
Input of 0 or 1  | sum(Layer 1 * weights | sum(Layer 2 * weights)
                 |  * Input of 0 or 1)   | 

This means that layer 2 uses layer 1's outputs as well as its own input values to compute a value. The reason I want this architecture, instead of putting every feature on layer 1, is that I want different features to multiply each other rather than just being summed.
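
For concreteness, here is a rough numpy sketch of the computation I have in mind, with made-up weights, only to show where the multiplication comes in:

import numpy as np

driver_input  = np.array([1.0, 0.0, 1.0])   # D1 and D3 race as a team
weather_input = np.array([1.0, 0.0])        # Rain, not Sunny

W1 = np.random.rand(3, 2)                   # layer 1 -> layer 2 weights
W2 = np.random.rand(2, 1)                   # layer 2 -> output weights

layer1 = driver_input                       # layer 1: the 0/1 driver encoding
layer2 = (layer1 @ W1) * weather_input      # layer 2: weighted sum of layer 1, multiplied by its own 0/1 inputs
expected_time = layer2 @ W2                 # layer 3: expected race time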

I could not find anything like this, but it is probably just me not knowing the name of this approach. Can someone point me to sources, or explain how to replicate this in TensorFlow/PyTorch/any other lib?

Turns out it was actually pretty simple. For anyone who might stumble upon this post and would like to test this approach, here's rough code:

Racing dataset

#         TEAM 1                TEAM 2                 "Weather"    "WON"
#         "A","B","C","D","E",  "A","B","C","D","E",   W1   W2  W3  (combined times of team 1 < combined times of team 2)
dataset=[[ 1,  1,  0,  0,  0,    0,  0,  1,  1,  0,    1,   0,  0,          1],
         [ 1,  1,  0,  0,  0,    0,  0,  1,  0,  1,    1,   0,  0,          1],
         [ 1,  1,  0,  0,  0,    0,  0,  0,  1,  1,    1,   0,  0,          1],
         [ 1,  0,  1,  0,  0,    0,  1,  0,  1,  0,    1,   0,  0,          1],
         [ 1,  0,  1,  0,  0,    0,  0,  0,  1,  1,    1,   0,  0,          0],
         [ 1,  1,  0,  0,  0,    0,  0,  0,  1,  1,    0,   1,  0,          0],
         [ 1,  1,  0,  0,  0,    0,  0,  1,  1,  0,    0,   1,  0,          0],
         [ 1,  1,  0,  0,  0,    0,  0,  1,  0,  1,    0,   1,  0,          0],
         [ 1,  0,  0,  0,  1,    0,  0,  1,  1,  0,    0,   1,  0,          0],
         [ 0,  1,  1,  0,  0,    0,  0,  0,  1,  1,    0,   1,  0,          1],
        ]

# First 10 columns: the two team encodings; next 3: the weather encoding
inputs=[[x[0:-4],x[-4:-1]] for x in dataset]
# Last column: the label (1 if team 1 won)
results=[[x[-1]] for x in dataset]

Typings to make code more readable

import numpy as np
from typing import Iterator

class InputLayer():
    def __init__(self, inputs,useBias=False):
        self.inputs=inputs
        self.useBias=useBias

    def __str__(self):
        return "Layer of size "+ str(self.inputs)
    def __repr__(self) -> str:
        return self.__str__()

class InputLayerValue():
    def __init__(self, values):
        self.values=np.array(values)

Actual model

import torch
from torch import nn
class MutipleInputModel(nn.Module):
    def __init__(self,input_layers:Iterator[InputLayer],output_size):
        super(MutipleInputModel, self).__init__()
        print(input_layers)
        self.nns=[]
        
        for i in range(len(input_layers)-1):
            current:InputLayer=input_layers[i]
            next:InputLayer=input_layers[i+1]
            il=nn.Linear(current.inputs,next.inputs,current.useBias)
            #To have hidden layers, you need to either use another model or create and attach multiple Linear models - nn.Linear(next.inputs,next.inputs)
            name="nn"+str(i)
            #models must be directly under self to be found by model.parameters()
            self.__setattr__(name,il)
            self.nns.append(name)
            
        il=nn.Linear(input_layers[-1].inputs,output_size,input_layers[-1].useBias)
        name="nnOutput"
        self.__setattr__(name,il)
        self.nns.append(name)

    def forward(self, inputs:Iterator[InputLayerValue]):
        inputsLen=len(inputs[0])
        if inputsLen != len(self.nns):
            raise Exception("Number of input values provided and input layers must be equal. Provided "+str(inputsLen)+" sets of inputs for a "+str(len(self.nns))+"-input-layer network")

        #Initialize first layer of inputs with ones which will then be multiplied by the actual input values
        lastOutput=torch.ones(len(inputs),len(inputs[0][0].values))                             # Layer 1 Outputs | Layer 2 provided Inputs | Layer 2 actual Inputs
        for i in range(inputsLen):                                                              #    lastOutput   |      multiplier         |     input
            multiplier=torch.from_numpy(np.array([x[i].values for x in inputs])).float()        #       0.2       |         0               |       0
            input=lastOutput*multiplier                                                         #       1.5       |         1               |      1.5
            lastOutput=self.__getattr__(self.nns[i])(input)                                     #       1.0       |         5               |       5 

        return lastOutput
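
A quick sanity check of the forward pass, assuming the dataset and typing snippets above have already been run (it just reuses the structures defined there):

# One forward pass over the whole dataset: expect one logit per race
model = MutipleInputModel(input_layers=[InputLayer(10), InputLayer(3)], output_size=1)
batch = [[InputLayerValue(x[0]), InputLayerValue(x[1])] for x in inputs]
print(model(batch).shape)   # torch.Size([10, 1])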

Training

# Define hyperparameters
model = MutipleInputModel(input_layers=[InputLayer(len(x)) for x in inputs[0]],output_size=1)
n_epochs = 1000
lr=0.001
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

for epoch in range(1, n_epochs + 1):
    optimizer.zero_grad() # Clears existing gradients from previous epoch
    output = model([[InputLayerValue(y) for y in x] for x in inputs])
    loss = criterion(output, torch.from_numpy(np.array(results)).float())
    loss.backward()
    optimizer.step() 
    
    print('Epoch: {}/{}.............'.format(epoch, n_epochs), end=' ')
    print("Loss: {:.4f}".format(loss.item()))

Testing:

def predict(model, input):
    input = [[InputLayerValue(y) for y in input]]
    out = model(input)
    return nn.Sigmoid()(out[0][0]).item()

print(predict(model,[[1, 1, 0, 0, 0, 0, 0, 1, 1, 0], [1, 0, 0]]))
print(predict(model,[[1, 1, 0, 0, 0, 0, 0, 1, 1, 0], [0, 1, 0]]))
print(predict(model,[[1, 1, 0, 0, 0, 0, 0, 1, 1, 0], [0, 0, 1]]))

This is a really basic implementation, but it could easily be modified to have hidden layers.
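
For example, one (untested) way to do that would be to replace the single nn.Linear built for each pair of input layers in MutipleInputModel.__init__ with a small stack, as the comment there hints; the forward pass can stay as it is because an nn.Sequential is called the same way:

hidden_size=8  # arbitrary hidden width, purely illustrative
il=nn.Sequential(
    nn.Linear(current.inputs,hidden_size,current.useBias),
    nn.ReLU(),
    nn.Linear(hidden_size,next.inputs,current.useBias),
)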

It clearly needs further testing to see whether it is actually better than a traditional NN, but I would say it is great for NN explainability.
