I've put together a computation, and I'm trying to compute a loss on its result and then the gradients of all the model parameters with respect to that loss. The catch is that nestled inside the computation is a model that I (eventually) want to be able to tune. For now I'm just trying to confirm that I can see the gradients of the model parameters after calling backward(), which I cannot. That is the problem. Below I post the code, the output, and the desired output.
class ExpModelTunable(torch.nn.Module):
    def __init__(self):
        super(ExpModelTunable, self).__init__()
        self.alpha = torch.nn.Parameter( torch.tensor(1.0, requires_grad=True) )
        self.beta = torch.nn.Parameter( torch.tensor(1.0, requires_grad=True) )

    def forward(self, t):
        return self.alpha * torch.exp( - self.beta * t )

def func_f(t, t_list):
    mu = torch.tensor(0.13191110355, requires_grad=True)
    running_sum = torch.sum( torch.tensor( [ f(t-ti) for ti in t_list ], requires_grad=True ) )
    return mu + running_sum

def pytorch_objective_tunable(u, t_list):
    global U
    steps = torch.linspace(t_list[-1].item(),u.item(),100, requires_grad=True)
    func_values = torch.tensor( [ func_f(steps[i], t_list) for i in range(len(steps)) ], requires_grad=True )
    return torch.log(U) + torch.trapz(func_values, steps)

def newton_method(function, func, initial, t_list, iteration=200, convergence=0.0001):
    for i in range(iteration):
        previous_data = initial.clone()
        value = function(initial, t_list)
        initial.data -= (value / func(initial.item(), t_list)).data
        if torch.abs(initial - previous_data) < torch.tensor(convergence):
            return initial
    return initial # return our final after iteration

# call starts
f = ExpModelTunable()
U = torch.rand(1, requires_grad=True)
initial_x = torch.tensor([.1], requires_grad=True)
t_list = torch.tensor([0.0], requires_grad=True)
result = newton_method(pytorch_objective_tunable, func_f, initial_x, t_list)
print("Next Arrival at ", result.item())
This prints the following, which is correct, so all good here: Next Arrival at 4.500311374664307
My problem occurs here:
loss = result - torch.tensor(1)
loss.backward()
print( result.grad )
for param in f.parameters():
    print(param.grad)
output:
tensor([1.])
None #this should not be None
None #this should not be None
So we can see that the gradient of result is populated, but the gradients of the parameters of the model f are not. I went back through the whole computation (all the code is here) and made sure that anything and everything has requires_grad=True, but I still can't get it to work. This should work, right? Does anyone have any tips? Thanks.
There are a few issues with your code. Right away, you can tell whether backpropagation can reach the model at all by looking at your output tensor:
>>> result
tensor([...], requires_grad=True)
It doesn't have a grad_fn, so you already know it's not connected to a graph.
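As a minimal sketch of that check (the tensors and values here are made up for illustration and are not taken from the code above), compare a user-created leaf tensor, the result of a differentiable operation, and a tensor updated through .data:

```python
import torch

# A tensor created directly by the user is a leaf: it can require grad,
# but it has no grad_fn because no operation produced it.
leaf = torch.tensor([2.0], requires_grad=True)

# A tensor produced by a differentiable operation records that operation.
connected = leaf * 3.0

# Updating through .data bypasses autograd, so nothing is recorded
# and the tensor stays a disconnected leaf.
updated = torch.tensor([0.1], requires_grad=True)
updated.data -= (leaf * 3.0).data

print(leaf.grad_fn)       # None
print(connected.grad_fn)  # a MulBackward0 node
print(updated.grad_fn)    # None
```

A result that prints with requires_grad=True but no grad_fn, like updated here, will only ever report a gradient for itself when you call backward().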
Now, for debugging the issues, here are some tips:
First, you should never mutate .data or call .item() if you're planning on backpropagating. Doing so essentially kills the graph, because any operation performed afterwards won't be attached to it.
You don't actually need to pass requires_grad most of the time. Note that nn.Parameter assigns requires_grad=True to the tensor by default.
When working with list comprehensions inside your PyTorch pipeline, wrap the list with torch.stack rather than torch.tensor. It keeps the code tidy and, unlike torch.tensor, keeps the elements attached to the graph.
I wouldn't use a global if I were you...
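To illustrate the first and third tips, here is a small sketch under made-up assumptions (w stands in for a model parameter and ts for the list of times; neither name comes from the code above). It compares rebuilding a tensor from Python floats with stacking the existing tensors:

```python
import torch

w = torch.ones(1, requires_grad=True)   # hypothetical stand-in for a parameter
ts = torch.tensor([0.0, 1.0, 2.0])      # hypothetical time points

# .item() turns each element into a Python float, and torch.tensor builds a
# brand-new leaf from those floats, so the history back to w is lost.
broken = torch.tensor([(w * torch.exp(-t)).item() for t in ts], requires_grad=True).sum()
broken.backward()
grad_after_broken = w.grad
print(grad_after_broken)  # None: the gradient never reaches w

# torch.stack joins the existing tensors, so the graph back to w is kept.
kept = torch.stack([w * torch.exp(-t) for t in ts]).sum()
kept.backward()
print(w.grad)  # about 1.5032, i.e. 1 + e^-1 + e^-2
```

Setting requires_grad=True on the rebuilt tensor does not help: it only makes that new tensor a leaf with its own gradient, which is what the original code ends up printing.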
Here is the corrected version:
import torch
from torch import nn

class ExpModelTunable(nn.Module):
    def __init__(self):
        super(ExpModelTunable, self).__init__()
        # nn.Parameter sets requires_grad=True by default
        self.alpha = nn.Parameter(torch.ones(1))
        self.beta = nn.Parameter(torch.ones(1))

    def forward(self, t):
        return self.alpha * torch.exp(-self.beta*t)

f = ExpModelTunable()

def func_f(t, t_list):
    mu = torch.tensor(0.13191110355)
    # torch.stack keeps every f(t-ti) attached to the graph
    running_sum = torch.stack([f(t-ti) for ti in t_list]).sum()
    return mu + running_sum

def pytorch_objective_tunable(u, t_list):
    global U
    steps = torch.linspace(t_list[-1].item(), u.item(), 100)
    func_values = torch.stack([func_f(steps[i], t_list) for i in range(len(steps))])
    return torch.log(U) + torch.trapz(func_values, steps)
    # return torch.trapz(func_values, steps)

def newton_method(function, func, initial, t_list, iteration=1, convergence=0.0001):
    for i in range(iteration):
        previous_data = initial.clone()
        value = function(initial, t_list)
        # no .data or .item() here, so the update is recorded by autograd
        initial -= (value / func(initial, t_list))
        if torch.abs(initial - previous_data) < torch.tensor(convergence):
            return initial
    return initial # return our final after iteration

U = torch.rand(1, requires_grad=True)
initial_x = torch.tensor([.1])
t_list = torch.tensor([0.0], requires_grad=True)
result = newton_method(pytorch_objective_tunable, func_f, initial_x, t_list)
Notice the grad_fn now attached to result:
>>> result
tensor([...], grad_fn=<SubBackward0>)
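Once the result carries a grad_fn, calling backward() on a loss built from it should populate the parameter gradients. The following is a self-contained sketch of that last step, not the full pipeline: ExpModel is a hypothetical stand-in for ExpModelTunable, and the single update rule is invented for illustration rather than being the Newton step used above.

```python
import torch
from torch import nn

class ExpModel(nn.Module):  # hypothetical stand-in for ExpModelTunable
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1))
        self.beta = nn.Parameter(torch.ones(1))

    def forward(self, t):
        return self.alpha * torch.exp(-self.beta * t)

model = ExpModel()
x = torch.tensor([0.1])

# One illustrative update step, written without .data or .item(),
# so autograd records how x_next depends on alpha and beta.
x_next = x - model(x) / (0.5 + model(x))

# Reduce to a scalar before calling backward().
loss = (x_next - torch.tensor(1.0)).sum()
loss.backward()

for name, param in model.named_parameters():
    print(name, param.grad)  # both gradients are tensors, not None
```

The same loop over f.parameters() from the question can be used on the corrected code to confirm that the gradients are no longer None.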