Calculating Hessian matrix with nested gradient tape

I am trying to calculate the Hessian matrix of a neural network's output with respect to its input, using nested gradient tapes and two batch_jacobian calls. The first call returns the Jacobian as expected; the second raises an error I can't make sense of. I take [:, 0] of the first result because the network output is a single scalar per sample, so the second axis of the Jacobian has length 1. Execution stops at the line where the Hessian is computed.

def dynamics(self, x):  # estimate the system's dynamics from the current Lagrangian model
    with tf.GradientTape() as tape2:  # outer tape: records the first-order derivatives
        tape2.watch(x)
        with tf.GradientTape(persistent=True) as tape:  # inner tape: records the forward pass
            tape.watch(x)
            lagrangian = self.lagrangian_net(x)
            print(lagrangian)
        # Jacobian of the scalar output; [:, 0] drops the length-1 output axis
        g = tape.batch_jacobian(lagrangian, x, unconnected_gradients='zero')[:, 0]
        print(g)
    hessian = tape2.batch_jacobian(g, x, unconnected_gradients='zero')  # the error is raised here
    print(hessian)
    U = g[:, 0, :] - tf.einsum("dij,dj->di", hessian[:, 1, :, 0, :], x[:, 1, :])  # U[d, i]
    P = hessian[:, 1, :, 1, :]
    P = tf.map_fn(tf.linalg.inv, P)  # P[d, i, k]
    A = tf.einsum("di,dik->dk", U, P)
    return A  # accelerations for the batch
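
For readers following the indexing, here is how I read the last few lines as math. This is a sketch under assumptions the post does not state: that x[:, 0, :] holds the generalized coordinates q, that x[:, 1, :] holds the velocities \dot q, and that the Lagrangian L has no explicit time dependence. Under those assumptions the function solves the Euler-Lagrange equations for the accelerations:

```latex
\frac{d}{dt}\frac{\partial L}{\partial \dot q_i} = \frac{\partial L}{\partial q_i}
\quad\Longrightarrow\quad
\sum_j \frac{\partial^2 L}{\partial \dot q_i\,\partial \dot q_j}\,\ddot q_j
= \underbrace{\frac{\partial L}{\partial q_i}
  - \sum_j \frac{\partial^2 L}{\partial \dot q_i\,\partial q_j}\,\dot q_j}_{U_i}

\ddot q_k = \sum_i U_i\,P_{ik},
\qquad
P = \left(\frac{\partial^2 L}{\partial \dot q\,\partial \dot q}\right)^{-1}
```

Here hessian[:, 1, :, 0, :] plays the role of the mixed block \partial^2 L / \partial \dot q_i \partial q_j and hessian[:, 1, :, 1, :] the velocity-velocity block that gets inverted. Because that block is symmetric, contracting U with the first index of P gives the same result as multiplying by the inverse from the left.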

Error:

Traceback (most recent call last):
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\backprop.py", line 1183, in batch_jacobian
    parallel_iterations=parallel_iterations)
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\ops\parallel_for\control_flow_ops.py", line 164, in pfor
    return f()
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\function.py", line 1323, in __call__
    graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\function.py", line 1652, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\function.py", line 1545, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\framework\func_graph.py", line 715, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\framework\func_graph.py", line 705, in wrapper
    raise e.ag_error_metadata.to_exception(type(e))
ValueError: in converted code:
    relative to C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\ops\parallel_for:

control_flow_ops.py:161 f *
    return _pfor_impl(loop_fn, iters, parallel_iterations=parallel_iterations)
control_flow_ops.py:214 _pfor_impl
    outputs.append(converter.convert(loop_fn_output))
pfor.py:1175 convert
    output = self._convert_helper(y)
pfor.py:1343 _convert_helper
    "which may run slower" % (y_op.type, y_op, converted_inputs))

ValueError: No converter defined for PartitionedCall
name: "loop_body/PartitionedCall"
op: "PartitionedCall"
input: "loop_body/Reshape_4"
input: "loop_body/PartitionedCall/args_1"
input: "loop_body/PartitionedCall/args_2"
input: "loop_body/PartitionedCall/args_3"
input: "loop_body/PartitionedCall/args_4"
input: "loop_body/PartitionedCall/args_5"
attr {
  key: "Tin"
  value {
    list {
      type: DT_DOUBLE
      type: DT_DOUBLE
      type: DT_DOUBLE
      type: DT_DOUBLE
      type: DT_DOUBLE
      type: DT_INT32
    }
  }
}
attr {
  key: "Tout"
  value {
    list {
      type: DT_DOUBLE
      type: DT_DOUBLE
    }
  }
}
attr {
  key: "_gradient_op_type"
  value {
    s: "PartitionedCall-385"
  }
}
attr {
  key: "config"
  value {
    s: ""
  }
}
attr {
  key: "config_proto"
  value {
    s: "\n\007\n\003CPU\020\001\n\007\n\003GPU\020\0002\002J\0008\001"
  }
}
attr {
  key: "executor_type"
  value {
    s: ""
  }
}
attr {
  key: "f"
  value {
    func {
      name: "__inference___backward_f_232_270"
    }
  }
}

inputs: [WrappedTensor(t=<tf.Tensor 'loop_body/Reshape_4/pfor/Reshape:0' shape=(4, 1, 3, 2, 2) dtype=float64>, is_stacked=True, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_1:0' shape=(4, 1) dtype=float64>, is_stacked=False, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_2:0' shape=(3, 1) dtype=float64>, is_stacked=False, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_3:0' shape=(1, 3, 1) dtype=float64>, is_stacked=False, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_4:0' shape=(1, 3, 1) dtype=float64>, is_stacked=False, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_5:0' shape=(3,) dtype=int32>, is_stacked=False, is_sparse_stacked=False)]. 
Either add a converter or set --op_conversion_fallback_to_while_loop=True, which may run slower

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Maks/Desktop/neural/lagrangian neural network.py", line 129, in <module>
    N.train(2)
  File "C:/Users/Maks/Desktop/neural/lagrangian neural network.py", line 110, in train
    self.train_step(x, true_y)
  File "C:/Users/Maks/Desktop/neural/lagrangian neural network.py", line 98, in train_step
    acc = self.dynamics(x)
  File "C:/Users/Maks/Desktop/neural/lagrangian neural network.py", line 83, in dynamics
    hessian = tape2.batch_jacobian(g, x, unconnected_gradients='zero')
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\backprop.py", line 1191, in batch_jacobian
    sys.exc_info()[2])
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\six.py", line 702, in reraise
    raise value.with_traceback(tb)
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\backprop.py", line 1183, in batch_jacobian
    parallel_iterations=parallel_iterations)
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\ops\parallel_for\control_flow_ops.py", line 164, in pfor
    return f()
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\function.py", line 1323, in __call__
    graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\function.py", line 1652, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\function.py", line 1545, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\framework\func_graph.py", line 715, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\framework\func_graph.py", line 705, in wrapper
    raise e.ag_error_metadata.to_exception(type(e))
ValueError: in converted code:
    relative to C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\ops\parallel_for:

control_flow_ops.py:161 f *
    return _pfor_impl(loop_fn, iters, parallel_iterations=parallel_iterations)
control_flow_ops.py:214 _pfor_impl
    outputs.append(converter.convert(loop_fn_output))
pfor.py:1175 convert
    output = self._convert_helper(y)
pfor.py:1343 _convert_helper
    "which may run slower" % (y_op.type, y_op, converted_inputs))

ValueError: No converter defined for PartitionedCall
name: "loop_body/PartitionedCall"
op: "PartitionedCall"
input: "loop_body/Reshape_4"
input: "loop_body/PartitionedCall/args_1"
input: "loop_body/PartitionedCall/args_2"
input: "loop_body/PartitionedCall/args_3"
input: "loop_body/PartitionedCall/args_4"
input: "loop_body/PartitionedCall/args_5"
attr {
  key: "Tin"
  value {
    list {
      type: DT_DOUBLE
      type: DT_DOUBLE
      type: DT_DOUBLE
      type: DT_DOUBLE
      type: DT_DOUBLE
      type: DT_INT32
    }
  }
}
attr {
  key: "Tout"
  value {
    list {
      type: DT_DOUBLE
      type: DT_DOUBLE
    }
  }
}
attr {
  key: "_gradient_op_type"
  value {
    s: "PartitionedCall-385"
  }
}
attr {
  key: "config"
  value {
    s: ""
  }
}
attr {
  key: "config_proto"
  value {
    s: "\n\007\n\003CPU\020\001\n\007\n\003GPU\020\0002\002J\0008\001"
  }
}
attr {
  key: "executor_type"
  value {
    s: ""
  }
}
attr {
  key: "f"
  value {
    func {
      name: "__inference___backward_f_232_270"
    }
  }
}

inputs: [WrappedTensor(t=<tf.Tensor 'loop_body/Reshape_4/pfor/Reshape:0' shape=(4, 1, 3, 2, 2) dtype=float64>, is_stacked=True, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_1:0' shape=(4, 1) dtype=float64>, is_stacked=False, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_2:0' shape=(3, 1) dtype=float64>, is_stacked=False, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_3:0' shape=(1, 3, 1) dtype=float64>, is_stacked=False, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_4:0' shape=(1, 3, 1) dtype=float64>, is_stacked=False, is_sparse_stacked=False), WrappedTensor(t=<tf.Tensor 'loop_body/PartitionedCall/args_5:0' shape=(3,) dtype=int32>, is_stacked=False, is_sparse_stacked=False)]. 
Either add a converter or set --op_conversion_fallback_to_while_loop=True, which may run slower

Encountered an exception while vectorizing the batch_jacobian computation. Vectorization can be disabled by setting experimental_use_pfor to False.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x0000026FE02C50D8>
Traceback (most recent call last):
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\function.py", line 305, in __del__
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\context.py", line 1663, in remove_function
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\context.py", line 865, in remove_function
TypeError: 'NoneType' object is not callable
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x0000026FE02C50D8>
Traceback (most recent call last):
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\function.py", line 305, in __del__
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\context.py", line 1663, in remove_function
  File "C:\Users\Maks\Anaconda3\envs\machine_learning\lib\site-packages\tensorflow\python\eager\context.py", line 865, in remove_function
TypeError: 'NoneType' object is not callable
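
The last message in the log points at one possible workaround: turning vectorization off with experimental_use_pfor=False. Below is a minimal sketch of what that might look like. The toy_lagrangian function and the input values are hypothetical stand-ins for self.lagrangian_net and the real data, and the input shape (batch, 2, 3) is an assumption on my part. Both tapes are created with persistent=True because, as far as I know, the non-vectorized path requires that in eager mode.

```python
import tensorflow as tf


def toy_lagrangian(x):
    # Hypothetical stand-in for self.lagrangian_net: one scalar per sample.
    # f(x) = sum of x**3, so the Hessian is diagonal with entries 6 * x.
    return tf.reduce_sum(x * x * x, axis=[1, 2])[:, tf.newaxis]


# Assumed layout: (batch, 2, n), matching how the post indexes x.
x = tf.constant([[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]], dtype=tf.float64)

with tf.GradientTape(persistent=True) as tape2:
    tape2.watch(x)
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(x)
        lagrangian = toy_lagrangian(x)
    # Computed inside tape2's context so the outer tape records it.
    g = tape.batch_jacobian(lagrangian, x, experimental_use_pfor=False)[:, 0]
hessian = tape2.batch_jacobian(g, x, experimental_use_pfor=False)

print(g.shape, hessian.shape)
```

This falls back to a loop instead of the vectorized pfor path, so it will likely be slower on large batches; it only sidesteps the missing PartitionedCall converter rather than fixing it.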

Solved by updating to TensorFlow 2.1.
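
If code has to run on both older and newer installs, one option is to pick the vectorized or non-vectorized path from the installed version. The helper names below are hypothetical, and the 2.1 threshold is taken only from the answer above; I have not verified it against the release notes.

```python
def version_tuple(version):
    """Turn a string such as '2.1.0' or '2.4.0-rc1' into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def can_vectorize_nested_jacobian(version):
    # Assumption from the answer above: 2.1 is the first release where the
    # nested batch_jacobian call stops failing. Not independently verified.
    return version_tuple(version) >= (2, 1)


# Usage (hypothetical), inside the dynamics function from the question:
#   use_pfor = can_vectorize_nested_jacobian(tf.__version__)
#   hessian = tape2.batch_jacobian(g, x, experimental_use_pfor=use_pfor)
```

Comparing integer tuples rather than raw strings keeps a version such as 2.10 from sorting below 2.1.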
