How to use input layers that also feed from the previous layer of a neural network?

Question

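Suppose I want to predict the winner of a two-driver team race, where some drivers usually rank higher under certain weather conditions: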

Race   |Driver | Weather | Time
Dummy1 |D1     | Rain    | 2:00
Dummy1 |D2     | Rain    | 5:00
Dummy1 |D3     | Rain    | 4:50
Dummy2 |D1     | Sunny   | 3:00
Dummy2 |D2     | Sunny   | 2:50
Dummy2 |D2     | Sunny   | 2:30
...

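The logic is that a team made up of D1 and D3 would outperform any other combination in the rain, but would not have the same luck in other weather. With that in mind, I came up with the following model: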

Layer 1          |   Layer 2             | Layer 3 (output)
Driver encoding  | weather encoding      | expected race time
----------------------------------------------------------------
Input of 0 or 1  | sum(Layer 1 * weights | sum(Layer 2 * weights)
                 |  * Input of 0 or 1)   |

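This means that layer 2 uses layer 1 as well as its own input values to compute its values. The reason I want this architecture, instead of having every feature on layer 1, is that I want the different features to multiply each other rather than be summed.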
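To make the sum-versus-product distinction concrete, here is a minimal pure-Python sketch. It is not from the original post: the weights and the helper names `additive_score`, `gated_score` and `gap` are made up for illustration. With a sum, the gap between two drivers is the same in every weather; with a product, the weather scales each driver's contribution.

```python
# Hypothetical hand-picked weights, for illustration only.
DRIVER_WEIGHT = {"D1": 2.0, "D2": 1.0}
WEATHER_WEIGHT = {"Rain": 0.5, "Sunny": 1.5}


def additive_score(driver, weather):
    """All features on one layer: contributions are simply summed."""
    return DRIVER_WEIGHT[driver] + WEATHER_WEIGHT[weather]


def gated_score(driver, weather):
    """Weather multiplies the driver term, so its effect depends on the driver."""
    return DRIVER_WEIGHT[driver] * WEATHER_WEIGHT[weather]


def gap(score, weather):
    """How far D1 is ahead of D2 under the given weather."""
    return score("D1", weather) - score("D2", weather)


# Additive: the D1-vs-D2 gap is identical in every weather.
print(gap(additive_score, "Rain"), gap(additive_score, "Sunny"))
# Multiplicative: the gap changes with the weather.
print(gap(gated_score, "Rain"), gap(gated_score, "Sunny"))
```

In the additive case the weather term cancels out of the comparison, which is why a single linear layer over one-hot features cannot, on its own, express "D1 is better only in the rain".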

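I couldn't find anything like this, but it may just be that I don't know the name of this approach. Could someone point me to a source or explain how to replicate it in tensorflow/pytorch/any other lib?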

Answer 1

It turns out it's actually quite simple. For anyone who may stumble upon this post and wants to test this approach, here is the rough code:

Race dataset

#         TEAM 1                TEAM 2                 "Weather"    "WON"
#         "A","B","C","D","E",  "A","B","C","D","E",   W1   W2  W3  (combined times of team 1< combined times of team 2) 
dataset=[[ 1,  1,  0,  0,  0,    0,  0,  1,  1,  0,    1,   0,  0,          1],
         [ 1,  1,  0,  0,  0,    0,  0,  1,  0,  1,    1,   0,  0,          1],
         [ 1,  1,  0,  0,  0,    0,  0,  0,  1,  1,    1,   0,  0,          1],
         [ 1,  0,  1,  0,  0,    0,  1,  0,  1,  0,    1,   0,  0,          1],
         [ 1,  0,  1,  0,  0,    0,  0,  0,  1,  1,    1,   0,  0,          0],
         [ 1,  1,  0,  0,  0,    0,  0,  0,  1,  1,    0,   1,  0,          0],
         [ 1,  1,  0,  0,  0,    0,  0,  1,  1,  0,    0,   1,  0,          0],
         [ 1,  1,  0,  0,  0,    0,  0,  1,  0,  1,    0,   1,  0,          0],
         [ 1,  0,  0,  0,  1,    0,  0,  1,  1,  0,    0,   1,  0,          0],
         [ 0,  1,  1,  0,  0,    0,  0,  0,  1,  1,    0,   1,  0,          1],
        ]

inputs=[[x[0:-4],x[-4:-1]] for x in dataset]
results=[[x[-1]] for x in dataset]
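The rows above are written out by hand. As a rough sketch of how such a row could be built from names, assuming the column order in the comment header (five team 1 flags, five team 2 flags, three weather flags, then the label), something like the following might work. `multi_hot` and `encode_race` are hypothetical helpers, not part of the original answer.

```python
DRIVERS = ["A", "B", "C", "D", "E"]
WEATHERS = ["W1", "W2", "W3"]


def multi_hot(selected, vocabulary):
    """Return a 0/1 list with a 1 for every vocabulary entry in `selected`."""
    return [1 if item in selected else 0 for item in vocabulary]


def encode_race(team1, team2, weather, team1_won):
    """Build one dataset row: team 1 flags, team 2 flags, weather flags, label."""
    return (multi_hot(team1, DRIVERS) + multi_hot(team2, DRIVERS)
            + multi_hot([weather], WEATHERS) + [int(team1_won)])


row = encode_race(["A", "B"], ["C", "D"], "W1", True)
# Same slicing as in the answer: 10 driver flags, 3 weather flags, 1 label.
drivers, weather, label = row[0:-4], row[-4:-1], row[-1]
print(drivers, weather, label)
```

This particular call reproduces the first row of the dataset.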

Types to make the code more readable

from typing import Iterator

import numpy as np  # used by InputLayerValue and by the model and training code

class InputLayer():
    def __init__(self, inputs,useBias=False):
        self.inputs=inputs
        self.useBias=useBias

    def __str__(self):
        return "Layer of size "+ str(self.inputs)
    def __repr__(self) -> str:
        return self.__str__()

class InputLayerValue():
    def __init__(self, values):
        self.values=np.array(values)

The actual model

import numpy as np
import torch
from torch import nn
from typing import Sequence

class MutipleInputModel(nn.Module):
    def __init__(self, input_layers: Sequence[InputLayer], output_size):
        super().__init__()
        print(input_layers)
        # Attribute names of the Linear sub-modules, in the order forward() applies them
        self.nns = []

        for i in range(len(input_layers) - 1):
            current = input_layers[i]
            following = input_layers[i + 1]
            il = nn.Linear(current.inputs, following.inputs, current.useBias)
            # To have hidden layers, you need to either use another model or create and attach
            # multiple Linear models - nn.Linear(following.inputs, following.inputs)
            name = "nn" + str(i)
            # Modules must be attributes of self to be found by model.parameters()
            setattr(self, name, il)
            self.nns.append(name)

        # Output layer; reads its bias setting from the last input layer so that
        # a single-input-layer network also works
        last = input_layers[-1]
        il = nn.Linear(last.inputs, output_size, last.useBias)
        name = "nnOutput"
        setattr(self, name, il)
        self.nns.append(name)

    def forward(self, inputs: Sequence[Sequence[InputLayerValue]]):
        # inputs[sample][layer] holds the values provided to that layer for that sample
        inputsLen = len(inputs[0])
        if inputsLen != len(self.nns):
            raise ValueError("Number of input values provided and input layers must be equal. Provided "+str(inputsLen)+" sets of inputs for a "+str(len(self.nns))+"-input-layer network")

        # Start from ones so that the first multiplication yields the first layer's raw input values
        lastOutput = torch.ones(len(inputs), len(inputs[0][0].values))
        # Example of the element-wise product for three units:
        #   previous layer output | provided input | actual input
        #       lastOutput        |   multiplier   |    gated
        #          0.2            |       0        |      0
        #          1.5            |       1        |     1.5
        #          1.0            |       5        |      5
        for i in range(inputsLen):
            multiplier = torch.from_numpy(np.array([x[i].values for x in inputs])).float()
            gated = lastOutput * multiplier
            lastOutput = getattr(self, self.nns[i])(gated)

        return lastOutput
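To see what the forward pass computes without running PyTorch, the same gating idea can be traced by hand. The sketch below is an illustration under stated assumptions, not the answer's code: it uses plain lists, no bias (the answer's default), tiny made-up weights, and hypothetical names `linear`, `forward`, `W_DRIVER` and `W_OUTPUT`.

```python
def linear(inputs, weights):
    """Bias-free linear layer: weights[j][i] connects input i to output j."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def forward(input_groups, layer_weights):
    """Multiply each layer's incoming values by that layer's own inputs, then apply it."""
    # Ones, so the first product is just the first group of inputs.
    last_output = [1.0] * len(input_groups[0])
    for group, weights in zip(input_groups, layer_weights):
        gated = [o * g for o, g in zip(last_output, group)]
        last_output = linear(gated, weights)
    return last_output


# Hypothetical weights: 2 driver inputs -> 2 weather slots -> 1 output.
W_DRIVER = [[0.5, 1.0],    # feeds weather slot 0
            [2.0, -1.0]]   # feeds weather slot 1
W_OUTPUT = [[1.0, 1.0]]

print(forward([[1, 1], [1, 0]], [W_DRIVER, W_OUTPUT]))  # weather slot 0 active
print(forward([[1, 1], [0, 1]], [W_DRIVER, W_OUTPUT]))  # weather slot 1 active
```

Because the weather flags are one-hot, each weather slot acts as a switch: it selects which learned combination of drivers reaches the output, so the same team can score differently under different weather.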

Training

# Define hyperparameters
model = MutipleInputModel(input_layers=[InputLayer(len(x)) for x in inputs[0]],output_size=1)
n_epochs = 1000
lr=0.001
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

for epoch in range(1, n_epochs + 1):
    optimizer.zero_grad() # Clears existing gradients from previous epoch
    output = model([[InputLayerValue(y) for y in x] for x in inputs])
    loss = criterion(output, torch.from_numpy(np.array(results)).float())
    loss.backward()
    optimizer.step() 
    
    print('Epoch: {}/{}.............'.format(epoch, n_epochs), end=' ')
    print("Loss: {:.4f}".format(loss.item()))

Testing:

def predict(model, input):
    input = [[InputLayerValue(y) for y in input]]
    out = model(input)
    return nn.Sigmoid()(out[0][0]).item()

print(predict(model,[[1, 1, 0, 0, 0, 0, 0, 1, 1, 0], [1, 0, 0]]))
print(predict(model,[[1, 1, 0, 0, 0, 0, 0, 1, 1, 0], [0, 1, 0]]))
print(predict(model,[[1, 1, 0, 0, 0, 0, 0, 1, 1, 0], [0, 0, 1]]))
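`predict` passes the model output through a sigmoid because training uses `BCEWithLogitsLoss`, which expects raw logits and applies the sigmoid internally. The following pure-Python sketch shows that conversion on its own; the `team1_wins` helper and its 0.5 threshold are assumptions added for illustration.

```python
import math


def sigmoid(logit):
    """Map a raw model output (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def team1_wins(logit, threshold=0.5):
    """Hypothetical helper: turn the probability into a yes/no prediction."""
    return sigmoid(logit) >= threshold


for logit in (-2.0, 0.0, 2.0):
    print(logit, round(sigmoid(logit), 3), team1_wins(logit))
```

A logit of 0 corresponds to a 50% chance that team 1 wins; positive logits favour team 1 and negative logits favour team 2.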

This is a very basic implementation, but it can easily be modified to have hidden layers.

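Obviously further testing is needed to see whether it is actually better than a traditional NN, but I would say it is very good for the interpretability of the NN.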
