
How does PyTorch implement forward for a quantized linear layer?

I have a quantized model in PyTorch, and I want to extract the parameters of a quantized linear layer and implement its forward pass manually. I searched the source code but only found this function:

def forward(self, x: torch.Tensor) -> torch.Tensor:
    # Delegates to the registered quantized op; no Python-level arithmetic here.
    return torch.ops.quantized.linear(
        x, self._packed_params._packed_params, self.scale, self.zero_point)

But I cannot find anywhere how torch.ops.quantized.linear is defined.

Can someone give me a hint about how the forward pass of a quantized linear layer is defined?

As for where torch.ops.quantized.linear is defined: I was looking for the same thing but was never able to find it. I believe it is probably somewhere in ATen (the aten C++ namespace). I did, however, find some useful PyTorch-based implementations in the NVIDIA TensorRT repo below. It is quite possible that these are the ones PyTorch actually calls via some DLLs. If you are trying to add quantization to a custom layer, these implementations walk you through it.
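On the original question of reproducing the forward pass by hand, the arithmetic of a per-tensor affine quantized linear layer can be sketched in plain Python. This is only a reference sketch under stated assumptions, not the kernel PyTorch actually runs: the function and argument names are hypothetical, the input is assumed to be quint8 and the weight qint8 with a single scale and zero point each, and the real backend may use fixed-point requantization that differs by one step in rare cases. In PyTorch, the integer codes and parameters should be obtainable from the layer via weight() (then int_repr(), q_scale(), q_zero_point() on the returned tensor), bias(), scale and zero_point.

```python
def quantize(value, scale, zero_point, qmin=0, qmax=255):
    """Map a float to an integer code: clamp(round(value / scale) + zero_point)."""
    # Python's round() rounds halves to even.
    q = round(value / scale) + zero_point
    return max(qmin, min(qmax, q))


def manual_quantized_linear(x_q, x_scale, x_zp, w_q, w_scale, w_zp,
                            bias, out_scale, out_zp):
    """Reference forward pass for a single input vector.

    x_q:  list of input integer codes (assumed quint8)
    w_q:  list of rows of weight integer codes (assumed qint8), one row per output
    bias: list of float biases, one per output
    Returns the output integer codes for out_scale / out_zp.
    """
    out = []
    for row, b in zip(w_q, bias):
        # Integer accumulation over zero-point-corrected codes.
        acc = sum((xq - x_zp) * (wq - w_zp) for xq, wq in zip(x_q, row))
        # Rescale the accumulator to a real value and add the float bias.
        y = acc * x_scale * w_scale + b
        # Requantize with the layer's output scale and zero point.
        out.append(quantize(y, out_scale, out_zp))
    return out


# Toy example: real input [1.0, -1.0], real weights [[0.5, 1.0], [-0.5, 0.0]].
codes = manual_quantized_linear(
    x_q=[130, 126], x_scale=0.5, x_zp=128,
    w_q=[[2, 4], [-2, 0]], w_scale=0.25, w_zp=0,
    bias=[0.5, 0.0], out_scale=0.25, out_zp=10,
)
print(codes)  # [10, 8], i.e. real outputs 0.0 and -0.5
```

Multiplying (code - out_zp) by out_scale recovers the real-valued outputs, which makes it easy to compare the result against the dequantized output of the actual layer.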

You can find the docs for the NVIDIA toolkit here and the GitHub page here.

For the linear layer specifically, see the QuantLinear layer here.

Under the hood, this calls TensorQuantFunction.apply() for post-training quantization or FakeTensorQuantFunction.apply() for quantization-aware training.
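The idea behind the "fake" variant can be shown with a generic sketch. This is not NVIDIA's code, and the function name and default range are assumptions made for illustration: fake quantization snaps a float onto the integer grid and immediately maps it back to float, so training sees the rounding and clamping error while all arithmetic stays in floating point.

```python
def fake_quantize(value, scale, zero_point, qmin=-128, qmax=127):
    """Quantize then immediately dequantize, so the result stays a float."""
    q = round(value / scale) + zero_point   # snap to the integer grid
    q = max(qmin, min(qmax, q))             # clamp to the representable range
    return (q - zero_point) * scale         # map back to a real value


print(fake_quantize(0.3, 0.25, 0))    # 0.25  (rounding error of 0.05)
print(fake_quantize(100.0, 0.25, 0))  # 31.75 (clamped at qmax = 127)
```

A true quantization function would instead return the integer code itself together with the scale, which is what an inference-time layer consumes.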
