How to apply the parameters (weights of a network) to a vector of inputs using PyTorch?

I have a simple "neural network" that takes two inputs, computes a linear combination, and outputs a scalar. For a single equation, i.e., in 1D, we would have: weight1*inputA + weight2*inputB = output . Now I am interested in multi-dimensional inputs, i.e., inputA and inputB are vectors instead. So I want my network to apply weight1 to a whole vector, for example: weight1 * [input1, input2] + weight2 * [input3, input4] . In this setting, I want the output to be a vector too: [out1, out2] , where out1 = weight1*input1 + weight2*input3 and out2 = weight1*input2 + weight2*input4 .

However, I don't want to change the input and output dimensions of my network. As in the 1D case, I would initialise the network as net = LinearNet(2,1) , since we take two inputs, inputA and inputB , and get one output. I understand that the input and output are themselves multi-dimensional, but this should not bother my network.

Below is a minimal working example:

import numpy as np 
import torch
from torch import nn

def f(x,a,b,c,d):
    x_next = np.zeros((2,))
    x_next[0] = x[0]*(a-b*x[1])
    x_next[1] = -x[1]*(c-d*x[0])
    return x_next  # returns an array of shape (2,)

a = 1
b = 0.5
c = 1
d = 0.5
x01 = 1 #init cond. varA
x02 = 2 #init cond. varB
params = [a,b,c,d,x01,x02]

# ==================
h = 0.001
T = 10
K = int(T/h)

# forward euler approx.
x_traj_FE = np.zeros((2,K))
a, b, c, d, x_traj_FE[0,0], x_traj_FE[1,0] = params
for k in range(K-1):
    x_traj_FE[:,k+1] = x_traj_FE[:,k] + h*f(x_traj_FE[:,k],a,b,c,d)


# ==================
class LinearNet(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LinearNet, self).__init__()

        self.layer1 = nn.Linear(input_dim, output_dim, bias=False)
        
        #no activation function

    def forward(self, x):
        x = self.layer1(x)
        #no activation
        return x

torch.manual_seed(789)
net = LinearNet(2,1)

x_traj = np.zeros((2,K))
a, b, c, d, x_traj[0,0], x_traj[1,0] = params
for k in range(K-1):
    print(net.layer1.weight)    
    input1 = x_traj[:,k]
    input2 = f(x_traj[:,k],a,b,c,d)
    inputs = torch.Tensor(np.array([input1, input2]))
    print('input:  '+str(inputs))
    print('shape:  '+str(inputs.shape))
    print('output: '+str(net(inputs)))
    break

If we run this, we will receive the following output:

Parameter containing:
tensor([[0.3871, 0.5595]], requires_grad=True)
input:  tensor([[ 1.,  2.],
        [ 0., -1.]])
shape:  torch.Size([2, 2])
output: tensor([[ 1.5060],
        [-0.5595]], grad_fn=<MmBackward0>)

Eventually, I would like to compute one forward Euler step of the coupled ODE f , where input1 is the current state of each equation in the system f (at the first step, the initial conditions), and input2 is the system f evaluated at that state. If we do it manually, one step can be calculated as follows (given the weights from the network above):

weights = [0.3871, 0.5595]
x_traj = np.zeros((2,K))
a, b, c, d, x_traj[0,0], x_traj[1,0] = params
for k in range(K-1):
    x_traj[:,k+1] = weights[0]*x_traj[:,k] + weights[1]*f(x_traj[:,k],a,b,c,d)
    break
x_traj

And the output is:

array([[1.    , 0.3871, 0.    , ..., 0.    , 0.    , 0.    ],
       [2.    , 0.2147, 0.    , ..., 0.    , 0.    , 0.    ]])

As we see, the output of the network differs from the manual computation. I don't see how the network computes this scalar-vector multiplication, and hence I can't work out how to get the same output as with the manual computation.

When you do inputs = torch.Tensor(np.array([input1, input2])) , you create a 2x2 tensor whose first row is input1 and whose second row is input2 . nn.Linear treats each row as one sample, so the network combines the two entries within each row. In your manual computation, by contrast, the weights multiply input1 and input2 as whole vectors, which corresponds to combining the entries within each column of that matrix. To make the manual computation give the same result as the network, you could change the statement in the loop to the following (I do not know which of the two is logically correct in your use case, however):

inp = np.stack([x_traj[:,k], f(x_traj[:,k],a,b,c,d)], axis=0)  # np.stack, since both pieces are NumPy arrays; same matrix as `inputs` in the first case
x_traj[:,k+1] = weights[0]*inp[:,0] + weights[1]*inp[:,1]  # equivalent to calling the network in the first case
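
If the manual computation is the one you actually want, the opposite fix is to leave the manual code alone and transpose the tensor before passing it to the network. The following is only a sketch under a few assumptions: it uses a bare nn.Linear(2, 1, bias=False) in place of LinearNet(2,1), overwrites the weights with the values printed in the question so the numbers can be compared, and looks at a single step only. The names x_k, row_wise, and col_wise are made up for this illustration.

```python
import numpy as np
import torch
from torch import nn

def f(x, a, b, c, d):
    # Same right-hand side as in the question
    return np.array([x[0] * (a - b * x[1]), -x[1] * (c - d * x[0])])

a, b, c, d = 1, 0.5, 1, 0.5
x_k = np.array([1.0, 2.0])  # state at step k (the initial conditions here)

# Bias-free linear layer with two inputs and one output (stand-in for LinearNet(2,1))
layer = nn.Linear(2, 1, bias=False)
with torch.no_grad():
    # Assumption: fix the weights to the values printed in the question
    layer.weight.copy_(torch.tensor([[0.3871, 0.5595]]))

# Rows are input1 and input2, exactly as in the question: shape (2, 2)
inputs = torch.tensor(np.array([x_k, f(x_k, a, b, c, d)]), dtype=torch.float32)

# nn.Linear treats each ROW as one sample: out = inputs @ weight.T
row_wise = layer(inputs)
assert torch.allclose(row_wise, inputs @ layer.weight.T)

# After transposing, row i holds the pair (input1[i], input2[i]),
# so the layer computes weight1*input1[i] + weight2*input2[i]
col_wise = layer(inputs.T).squeeze(-1)

# Manual computation from the question, for comparison
w1, w2 = layer.weight.detach().numpy()[0]
manual = w1 * x_k + w2 * f(x_k, a, b, c, d)

print(row_wise.detach().numpy().ravel())  # matches the network output in the question
print(col_wise.detach().numpy())          # matches the manual result
print(manual)
```

With this layout the network keeps its (2,1) shape and the vector dimension simply plays the role of the batch dimension. If the weights were set to [1, h], the same transposed call would compute one forward Euler step x + h*f(x).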
