
fp16 support in cuda thrust

I am not able to find anything about fp16 support in the thrust CUDA template library. Even the roadmap page has nothing about it: https://github.com/thrust/thrust/wiki/Roadmap

But I assume somebody has probably figured out how to overcome this problem, since fp16 support in CUDA has been around for more than 6 months.

As of today, I rely heavily on thrust in my code, and I have templated nearly every class I use in order to ease fp16 integration. Unfortunately, absolutely nothing works out of the box for the half type, not even this simple sample code:

//STL
#include <iostream>
#include <cstdlib>

//Cuda
#include <cuda_runtime_api.h>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <cuda_fp16.h>
#define T half // works when float is used in place of half

int main(int argc, char* argv[])
{
        thrust::device_vector<T> a(10,1.0f);
        float t = thrust::reduce( a.cbegin(),a.cend(),(float)0);
        std::cout<<"test = "<<t<<std::endl;
        return EXIT_SUCCESS;
}

This code cannot compile because there seems to be no implicit conversion from float to half or from half to float. However, it seems that there are intrinsics in CUDA that allow for an explicit conversion.
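For reference, below is a minimal sketch (not part of the original post) of what using those conversion intrinsics looks like in device code. It assumes a toolkit and architecture on which the cuda_fp16.h intrinsics are available (per the answer below, compute 5.3+ at the time of writing); the kernel scale_half is purely illustrative.

//Cuda
#include <cuda_fp16.h>

// Illustrative only: scale every fp16 element by a float factor,
// converting explicitly with the cuda_fp16.h intrinsics.
__global__ void scale_half(half* data, float factor, int n)
{
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
        {
                float f = __half2float(data[i]);     // explicit half -> float
                data[i] = __float2half(f * factor);  // explicit float -> half
        }
}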

Why can't I simply overload the half and float constructors in some CUDA header file, adding the existing intrinsics like this:

float::float( half a )
{
  return  __half2float( a ) ;
}

half::half( float a )
{
  return  __float2half( a ) ;
}

My question may seem basic but I don't understand why I haven't found much documentation about it.

Thank you in advance

The very short answer is that what you are looking for doesn't exist.

The slightly longer answer is that thrust is intended to work on fundamental and POD types only, and the CUDA fp16 half is not a POD type. It might be possible to make two custom classes (one for the host and one for the device) which implement all the required object semantics and arithmetic operators to work correctly with thrust, but it would not be an insignificant effort to do it (and it would require writing or adapting an existing FP16 host library).
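To give a feel for what that would involve, here is a minimal host-only sketch of such a wrapper. Everything in it is hypothetical (the names my_half, float_to_half_bits and half_bits_to_float are invented for illustration), the rounding is plain truncation and denormals are flushed to zero; a usable version would additionally need __host__ __device__ conversion and arithmetic operators, built on the cuda_fp16.h intrinsics on the device side, before thrust algorithms such as the reduce above could use it.

//STL
#include <cstring>

// Hypothetical host-side float -> fp16 bit conversion (truncation, denormals flushed to zero)
static unsigned short float_to_half_bits(float f)
{
        unsigned int x;
        std::memcpy(&x, &f, sizeof(x));
        unsigned short sign = (x >> 16) & 0x8000u;
        unsigned int fexp = (x >> 23) & 0xFFu;
        unsigned int mant = x & 0x7FFFFFu;
        if (fexp == 0xFFu)                        // Inf / NaN
                return sign | 0x7C00u | (mant ? 0x200u : 0u);
        int exp = (int)fexp - 127 + 15;           // rebias the exponent
        if (exp >= 0x1F) return sign | 0x7C00u;   // overflow  -> +/-Inf
        if (exp <= 0)    return sign;             // underflow -> +/-0 (simplification)
        return sign | (unsigned short)(exp << 10) | (unsigned short)(mant >> 13);
}

// Hypothetical host-side fp16 -> float bit conversion
static float half_bits_to_float(unsigned short h)
{
        unsigned int sign = (unsigned int)(h & 0x8000u) << 16;
        unsigned int hexp = (h >> 10) & 0x1Fu;
        unsigned int mant = h & 0x3FFu;
        unsigned int x;
        if (hexp == 0x1Fu)   x = sign | 0x7F800000u | (mant << 13);              // Inf / NaN
        else if (hexp == 0u) x = sign;                                           // zero / denormal flushed
        else                 x = sign | ((hexp - 15 + 127) << 23) | (mant << 13);
        float f;
        std::memcpy(&f, &x, sizeof(f));
        return f;
}

// Hypothetical wrapper storing raw fp16 bits, convertible to/from float on the host
struct my_half
{
        unsigned short bits;

        my_half() : bits(0) {}
        my_half(float f) : bits(float_to_half_bits(f)) {}
        operator float() const { return half_bits_to_float(bits); }
};

On the host, my_half h = 2.5f; float back = h; then round-trips through fp16, but wiring this (or a library equivalent) into thrust device algorithms is where the real effort the answer describes comes in.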

Note also that the current FP16 support is only in device code and only on compute 5.3 and newer devices. So unless you have a Tegra TX1, you can't use the FP16 library in device code anyway.
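As a practical aside (not in the original answer), whether a given GPU meets that requirement can be checked at runtime with cudaGetDeviceProperties, for example:

//Cuda
#include <cuda_runtime_api.h>
//STL
#include <iostream>

int main()
{
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);  // query device 0
        bool fp16_capable = (prop.major > 5) || (prop.major == 5 && prop.minor >= 3);
        std::cout << "compute capability " << prop.major << "." << prop.minor
                  << (fp16_capable ? " (fp16 capable)" : " (no fp16 arithmetic)") << std::endl;
        return 0;
}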
