
Hashing types at compile-time in C++17/C++2a

Consider the following code:

#include <iostream>
#include <type_traits>

template <class T>
constexpr std::size_t type_hash(T) noexcept 
{
    // Compute a hash for the type
    // DO SOMETHING SMART HERE
}

int main(int argc, char* argv[])
{
    auto x = []{};
    auto y = []{};
    auto z = x;
    std::cout << std::is_same_v<decltype(x), decltype(y)> << std::endl; // 0
    std::cout << std::is_same_v<decltype(x), decltype(z)> << std::endl; // 1
    constexpr std::size_t xhash = type_hash(x);
    constexpr std::size_t yhash = type_hash(y);
    constexpr std::size_t zhash = type_hash(z);
    std::cout << (xhash == yhash) << std::endl; // should be 0
    std::cout << (yhash == zhash) << std::endl; // should be 1
    return 0;
}

I would like the type_hash function to return a hash key unique to the type, at compile-time. Is there a way to do that in C++17, or in C++2a (ideally relying only on the standard and not on compiler intrinsics)?

I doubt that's possible in purely standard C++.


But there is a solution that will work on most major compilers (at least GCC, Clang, and MSVC). You could hash the strings returned by the following function:

template <typename T> constexpr const char *foo()
{
    #ifdef _MSC_VER
    return __FUNCSIG__;
    #else
    return __PRETTY_FUNCTION__;
    #endif
}
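A minimal sketch of how such a string could be hashed at compile time, assuming 64-bit FNV-1a as the hash function. The names type_signature, fnv1a and type_name_hash are hypothetical. The strings are compiler-specific, so the values are not portable across compilers or compiler versions, and collisions remain possible in principle.

```cpp
#include <cstdint>

// Returns a compiler-specific string that contains the spelling of T.
template <typename T>
constexpr const char* type_signature() noexcept {
#ifdef _MSC_VER
    return __FUNCSIG__;
#else
    return __PRETTY_FUNCTION__;
#endif
}

// 64-bit FNV-1a over a null-terminated string (an assumed choice of hash).
constexpr std::uint64_t fnv1a(const char* s) noexcept {
    std::uint64_t h = 0xcbf29ce484222325ull;   // FNV offset basis
    for (; *s != '\0'; ++s) {
        h ^= static_cast<unsigned char>(*s);
        h *= 0x100000001b3ull;                 // FNV prime
    }
    return h;
}

// Hypothetical convenience variable: one hash value per type.
template <typename T>
inline constexpr std::uint64_t type_name_hash = fnv1a(type_signature<T>());

// Evaluated entirely at compile time.
static_assert(type_name_hash<int> == type_name_hash<int>);
```

For a lambda, the value could be obtained with type_name_hash<decltype(x)>.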

I don't know a way to obtain a std::size_t for the hash.

But if you accept a pointer to something, maybe you can take the address of a static member in a template class.

I mean something like the following:

#include <iostream>
#include <type_traits>

template <typename>
struct type_hash
 {
   static constexpr int          i     { };
   static constexpr int const *  value { &i };
 };

template <typename T>
static constexpr auto type_hash_v = type_hash<T>::value;


int main ()
 {
   auto x = []{};
   auto y = []{};
   auto z = x;
   std::cout << std::is_same_v<decltype(x), decltype(y)> << std::endl; // 0
   std::cout << std::is_same_v<decltype(x), decltype(z)> << std::endl; // 1
   constexpr auto xhash = type_hash_v<decltype(x)>;
   constexpr auto yhash = type_hash_v<decltype(y)>;
   constexpr auto zhash = type_hash_v<decltype(z)>;
   std::cout << (xhash == yhash) << std::endl; // should be 0
   std::cout << (xhash == zhash) << std::endl; // should be 1
 } // ...........^^^^^  xhash, not yhash

If you really want type_hash as a function, I suppose you could simply create a function that returns the type_hash_v<T> of the type received.

Based on HolyBlackCat's answer, here is a constexpr variable template that is a (naive) implementation of a hash of a type:

template <typename T>
constexpr std::size_t Hash()
{
    std::size_t result{};

#ifdef _MSC_VER
#define F __FUNCSIG__
#else
#define F __PRETTY_FUNCTION__
#endif

    for (const auto &c : F)
        (result ^= c) <<= 1;

    return result;
}

template <typename T>
constexpr std::size_t constexpr_hash = Hash<T>();

It can be used as shown below:

constexpr auto f = constexpr_hash<float>;
constexpr auto i = constexpr_hash<int>;

Check on godbolt that the values are indeed computed at compile time.

I don't think it is possible. "Hash key unique to the type" sounds like you are looking for a perfect hash (no collisions). Even if we ignore that size_t has a finite number of possible values, in general we can't know all the types because of things like shared libraries.

Do you need it to persist between runs? If not, you can set up a registration scheme.
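One possible shape of such a registration scheme is sketched below. The names next_type_id, runtime_type_id and registration_demo are hypothetical. Ids are handed out at run time on first use, so they are not constant expressions, and they may differ between runs or depend on the order of first use.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hands out consecutive ids; the counter is shared by all types.
inline std::size_t next_type_id() noexcept {
    static std::atomic<std::size_t> counter{0};
    return counter.fetch_add(1, std::memory_order_relaxed);
}

// Each instantiation registers itself once, on its first call.
template <class T>
std::size_t runtime_type_id() noexcept {
    static const std::size_t id = next_type_id();
    return id;
}

// Usage sketch: ids are stable within a run and distinct across types.
inline void registration_demo() {
    assert(runtime_type_id<int>() == runtime_type_id<int>());
    assert(runtime_type_id<int>() != runtime_type_id<float>());
}
```

Unlike a string hash, this cannot collide within a single program image, but it gives up compile-time evaluation.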

I agree with the other answers that this is not generally possible as stated in standard C++ yet, but we can solve a constrained version of the problem.

Since this is all compile-time programming, we cannot have mutable state. If you're willing to use a new variable for each state change, then something like this is possible:

  • hash_state1 = hash(type1)
  • hash_state2 = hash(type2, hash_state1)
  • hash_state3 = hash(type3, hash_state2)

Here "hash_state" is really just a unique typelist of all the types we've hashed so far. It can also provide a size_t value as a result of hashing a new type. If a type that we seek to hash is already present in the typelist, we return the index of that type.

This requires quite a bit of boilerplate:

  1. Ensuring types are unique within a typelist: I used @Deduplicator's answer here: https://stackoverflow.com/a/56259838/27678
  2. Finding a type in a unique typelist
  3. Using if constexpr to check if a type is in the typelist (C++17)



Part 1: a unique typelist

Again, all credit to @Deduplicator's answer for this part. The following code saves compile-time performance by doing lookups on a typelist in O(log N) time, thanks to leaning on the implementation of std::tuple_cat.

The code is written almost frustratingly generically, but the nice part is that it allows you to work with any generic typelist (tuple, variant, something custom).

namespace detail {
    template <template <class...> class TT, template <class...> class UU, class... Us>
    auto pack(UU<Us...>)
    -> std::tuple<TT<Us>...>;

    template <template <class...> class TT, class... Ts>
    auto unpack(std::tuple<TT<Ts>...>)
    -> TT<Ts...>;

    template <std::size_t N, class T>
    using TET = std::tuple_element_t<N, T>;

    template <std::size_t N, class T, std::size_t... Is>
    auto remove_duplicates_pack_first(T, std::index_sequence<Is...>)
    -> std::conditional_t<(... || (N > Is && std::is_same_v<TET<N, T>, TET<Is, T>>)), std::tuple<>, std::tuple<TET<N, T>>>;

    template <template <class...> class TT, class... Ts, std::size_t... Is>
    auto remove_duplicates(std::tuple<TT<Ts>...> t, std::index_sequence<Is...> is)
    -> decltype(std::tuple_cat(remove_duplicates_pack_first<Is>(t, is)...));

    template <template <class...> class TT, class... Ts>
    auto remove_duplicates(TT<Ts...> t)
    -> decltype(unpack<TT>(remove_duplicates<TT>(pack<TT>(t), std::make_index_sequence<sizeof...(Ts)>())));
}

template <class T>
using remove_duplicates_t = decltype(detail::remove_duplicates(std::declval<T>()));

Next, I declare my own custom typelist for use with the above code. It is a pretty straightforward empty struct that most of you have seen before:

template<class...> struct typelist{};

Part 2: our "hash_state"

The "hash_state", which I'm calling hash_token:

template<size_t N, class...Ts>
struct hash_token
{
    template<size_t M, class... Us>
    constexpr bool operator ==(const hash_token<M, Us...>&)const{return N == M;}
    constexpr size_t value() const{return N;}
};

It simply encapsulates a size_t for the hash value (also accessible via the value() function) and a comparator to check whether two hash_tokens are identical. The comparator looks only at the hash value and ignores the type list, because two tokens can have different type lists but the same hash value, e.g., if you hash int to get a token and then compare that token to one where you've hashed (int, float, char, int).

Part 3: type_hash function

Finally, our type_hash function:

template<class T, size_t N, class... Ts>
constexpr auto type_hash(T, hash_token<N, Ts...>) noexcept
{
    if constexpr(std::is_same_v<remove_duplicates_t<typelist<Ts..., T>>, typelist<Ts...>>)
    {
        return hash_token<detail::index_of<T, Ts...>(), Ts...>{};
    }
    else
    {
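        // A new type gets the current list size as its index. N + 1 could
        // reuse an existing index, because N drops back after a repeated type.
        return hash_token<sizeof...(Ts), Ts..., T>{};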
    }
}

template<class T>
constexpr auto type_hash(T) noexcept
{
    return hash_token<0, T>{};
}

The first overload is for the generic case: you've already "hashed" a number of types, and you want to hash yet another one. It checks whether the type you're hashing has already been hashed, and if so, it returns the index of the type in the unique type list.

To get the index of a type in a typelist, I used simple pack expansion to save some compile-time template instantiations (avoiding a recursive lookup):

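// Find the index of T in Ts (assuming T is in Ts; Ts is a unique typelist,
// so there is at most one match). It lives in namespace detail and must be
// declared before type_hash, which calls detail::index_of.
namespace detail {
    template<class T, class... Ts>
    constexpr size_t index_of()
    {
        size_t index = 0;
        size_t toReturn = 0;
        using swallow = size_t[];
        (void)swallow{0, (void(std::is_same_v<T, Ts> ? toReturn = index : index), ++index)...};

        return toReturn;
    }
}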

The second overload of type_hash is for creating an initial hash_token starting at 0.

Usage:

int main()
{
    auto x = []{};
    auto y = []{};
    auto z = x;
    std::cout << std::is_same_v<decltype(x), decltype(y)> << std::endl; // 0
    std::cout << std::is_same_v<decltype(x), decltype(z)> << std::endl; // 1

    constexpr auto xtoken = type_hash(x);
    constexpr auto xytoken = type_hash(y, xtoken);
    constexpr auto xyztoken = type_hash(z, xytoken);
    std::cout << (xtoken == xytoken) << std::endl; // 0
    std::cout << (xtoken == xyztoken) << std::endl; // 1
}
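For experimenting with the idea in a single file, the three parts can be condensed into the sketch below. It is a simplified variation, not the code above: it assumes the type list only ever grows through type_hash (so it stays unique without remove_duplicates_t), and it uses a hypothetical helper index_or_size in place of detail::index_of. A newly added type gets sizeof...(Ts) as its index.

```cpp
#include <cstddef>
#include <type_traits>

// Token carrying the index N of the most recently hashed type and the
// list Ts... of all distinct types seen so far.
template <std::size_t N, class... Ts>
struct hash_token {
    template <std::size_t M, class... Us>
    constexpr bool operator==(const hash_token<M, Us...>&) const { return N == M; }
    constexpr std::size_t value() const { return N; }
};

// Hypothetical helper: index of the first T in Ts..., or sizeof...(Ts)
// if T does not occur in the list.
template <class T, class... Ts>
constexpr std::size_t index_or_size() {
    std::size_t index = 0;
    bool found = false;
    // Comma fold, evaluated left to right; counting stops at the first match.
    ((found = found || std::is_same_v<T, Ts>, index += found ? 0 : 1), ...);
    return index;
}

// Hash T given the token of everything hashed so far.
template <class T, std::size_t N, class... Ts>
constexpr auto type_hash(const T&, hash_token<N, Ts...>) noexcept {
    constexpr std::size_t i = index_or_size<T, Ts...>();
    if constexpr (i < sizeof...(Ts))
        return hash_token<i, Ts...>{};                 // already known
    else
        return hash_token<sizeof...(Ts), Ts..., T>{};  // append a new type
}

// Initial token: the first type gets index 0.
template <class T>
constexpr auto type_hash(const T&) noexcept {
    return hash_token<0, T>{};
}
```

The tokens are chained exactly as in the usage above, for example type_hash(y, type_hash(x)).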

Conclusion:

This is not really useful in a lot of code, but it may help solve some constrained meta-programming problems.
