
Hashing types at compile-time in C++17/C++2a

Consider the following code:

#include <iostream>
#include <type_traits>

template <class T>
constexpr std::size_t type_hash(T) noexcept 
{
    // Compute a hash for the type
    // DO SOMETHING SMART HERE
}

int main(int argc, char* argv[])
{
    auto x = []{};
    auto y = []{};
    auto z = x;
    std::cout << std::is_same_v<decltype(x), decltype(y)> << std::endl; // 0
    std::cout << std::is_same_v<decltype(x), decltype(z)> << std::endl; // 1
    constexpr std::size_t xhash = type_hash(x);
    constexpr std::size_t yhash = type_hash(y);
    constexpr std::size_t zhash = type_hash(z);
    std::cout << (xhash == yhash) << std::endl; // should be 0
    std::cout << (yhash == zhash) << std::endl; // should be 1
    return 0;
}

I would like the type_hash function to return a hash key unique to the type, at compile-time. Is there a way to do that in C++17 or C++2a (ideally relying only on the standard, without compiler intrinsics)?

I doubt that's possible in purely standard C++.


But there is a solution that will work on most major compilers (at least GCC, Clang, and MSVC). You could hash strings returned by the following function:

template <typename T> constexpr const char *foo()
{
    #ifdef _MSC_VER
    return __FUNCSIG__;
    #else
    return __PRETTY_FUNCTION__;
    #endif
}

I don't know a way to obtain a std::size_t for the hash.

But if you accept a pointer to something, maybe you can take the address of a static member in a template class.

I mean something like the following:

#include <iostream>
#include <type_traits>

template <typename>
struct type_hash
 {
   static constexpr int          i     { };
   static constexpr int const *  value { &i };
 };

template <typename T>
static constexpr auto type_hash_v = type_hash<T>::value;


int main ()
 {
   auto x = []{};
   auto y = []{};
   auto z = x;
   std::cout << std::is_same_v<decltype(x), decltype(y)> << std::endl; // 0
   std::cout << std::is_same_v<decltype(x), decltype(z)> << std::endl; // 1
   constexpr auto xhash = type_hash_v<decltype(x)>;
   constexpr auto yhash = type_hash_v<decltype(y)>;
   constexpr auto zhash = type_hash_v<decltype(z)>;
   std::cout << (xhash == yhash) << std::endl; // should be 0
   std::cout << (xhash == zhash) << std::endl; // should be 1
 } // ...........^^^^^  xhash, not yhash

If you really want type_hash as a function, I suppose you could simply create a function that returns the type_hash_v<T> of the type received.

Based on HolyBlackCat's answer, here is a constexpr variable template that provides a (naive) implementation of a hash of a type:

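template <typename T>
constexpr std::size_t Hash()
{
    std::size_t result{};

// Pick the compiler-specific macro that expands to the full function
// signature, which includes the name of T.
#ifdef _MSC_VER
#define F __FUNCSIG__
#else
#define F __PRETTY_FUNCTION__
#endif

    // Naive mixing: XOR in each character, then shift left by one bit.
    for (const auto &c : F)
        (result ^= c) <<= 1;

#undef F // do not leak the one-letter helper macro

    return result;
}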

template <typename T>
constexpr std::size_t constexpr_hash = Hash<T>();

Can be used as shown below:

constexpr auto f = constexpr_hash<float>;
constexpr auto i = constexpr_hash<int>;

You can check on godbolt that the values are indeed computed at compile time.
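One caveat with the XOR-and-shift mixing above: each character is shifted left once for every character that follows it, so characters more than 64 positions from the end of the signature string no longer influence a 64-bit result. A possible alternative, sketched here under the same compiler-macro assumption, is to run 64-bit FNV-1a over the string. The names type_signature and fnv1a_type_hash are hypothetical, and the values still depend on the compiler and are not guaranteed to be collision-free.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical helper: the compiler-specific signature string, which
// contains the name of T.
template <typename T>
constexpr std::string_view type_signature() noexcept
{
#ifdef _MSC_VER
    return __FUNCSIG__;
#else
    return __PRETTY_FUNCTION__;
#endif
}

// 64-bit FNV-1a over the signature string.
template <typename T>
constexpr std::uint64_t fnv1a_type_hash() noexcept
{
    std::uint64_t hash = 0xcbf29ce484222325ull;   // FNV-1a 64-bit offset basis
    for (char c : type_signature<T>()) {
        hash ^= static_cast<unsigned char>(c);
        hash *= 0x100000001b3ull;                 // FNV-1a 64-bit prime
    }
    return hash;
}

template <typename T>
inline constexpr std::uint64_t fnv1a_type_hash_v = fnv1a_type_hash<T>();

static_assert(fnv1a_type_hash_v<int> != fnv1a_type_hash_v<float>);
```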

I don't think it is possible. "hash key unique to the type" sounds like you are looking for a perfect hash (no collisions). Even if we ignore that size_t has a finite number of possible values, in general we can't know all the types because of things like shared libraries.

Do you need it to persist between runs? If not, you can set up a registration scheme.
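One way such a registration scheme might look is sketched below. This is an assumption about what is meant, not code from the answer, and the names next_type_id and runtime_type_id are hypothetical. Each type receives the next free integer the first time its id is requested, so the ids are unique within one run but depend on the order of first use and are only available at run time.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical helper: hands out consecutive ids, one per call.
inline std::size_t next_type_id() noexcept
{
    static std::atomic<std::size_t> counter{0};
    return counter.fetch_add(1, std::memory_order_relaxed);
}

// Hypothetical helper: the id of T is fixed the first time it is requested
// and stays the same for the rest of the run.
template <class T>
std::size_t runtime_type_id() noexcept
{
    static const std::size_t id = next_type_id();
    return id;
}
```

Calling runtime_type_id<int>() twice yields the same value, and different types yield different values, but none of this is usable in a constant expression.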

I will agree with the other answers that it's not generally possible as-stated in standard C++ yet, but we may solve a constrained version of the problem.

Since this is all compile-time programming, we cannot have mutable state. If you're willing to introduce a new variable for each state change, though, something like this is possible:

  • hash_state1 = hash(type1)
  • hash_state2 = hash(type2, hash_state1)
  • hash_state3 = hash(type3, hash_state2)

Where "hash_state" is really just a unique typelist of all the types we've hashed so far. It can also provide a size_t value as a result of hashing a new type. If a type that we seek to hash is already present in the typelist, we return the index of that type.

This requires quite a bit of boilerplate:

  1. Ensuring types are unique within a typelist: I used @Deduplicator's answer here: https://stackoverflow.com/a/56259838/27678
  2. Finding a type in a unique typelist
  3. Using if constexpr to check if a type is in the typelist (C++17)



Part 1: a unique typelist:

Again, all credit to @Deduplicator's answer (linked above) for this part. The following code saves compile-time performance by doing lookups on a typelist in O(log N) time, thanks to leaning on the implementation of std::tuple_cat.

The code is written almost frustratingly generically, but the nice part is that it allows you to work with any generic typelist (tuple, variant, something custom).

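#include <cstddef>     // std::size_t
#include <tuple>       // std::tuple, std::tuple_cat, std::tuple_element_t
#include <type_traits> // std::conditional_t, std::is_same_v
#include <utility>     // std::declval, std::index_sequence

namespace detail {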
    template <template <class...> class TT, template <class...> class UU, class... Us>
    auto pack(UU<Us...>)
    -> std::tuple<TT<Us>...>;

    template <template <class...> class TT, class... Ts>
    auto unpack(std::tuple<TT<Ts>...>)
    -> TT<Ts...>;

    template <std::size_t N, class T>
    using TET = std::tuple_element_t<N, T>;

    template <std::size_t N, class T, std::size_t... Is>
    auto remove_duplicates_pack_first(T, std::index_sequence<Is...>)
    -> std::conditional_t<(... || (N > Is && std::is_same_v<TET<N, T>, TET<Is, T>>)), std::tuple<>, std::tuple<TET<N, T>>>;

    template <template <class...> class TT, class... Ts, std::size_t... Is>
    auto remove_duplicates(std::tuple<TT<Ts>...> t, std::index_sequence<Is...> is)
    -> decltype(std::tuple_cat(remove_duplicates_pack_first<Is>(t, is)...));

    template <template <class...> class TT, class... Ts>
    auto remove_duplicates(TT<Ts...> t)
    -> decltype(unpack<TT>(remove_duplicates<TT>(pack<TT>(t), std::make_index_sequence<sizeof...(Ts)>())));
}

template <class T>
using remove_duplicates_t = decltype(detail::remove_duplicates(std::declval<T>()));

Next, I declare my own custom typelist for using the above code. A pretty straightforward empty struct that most of you have seen before:

template<class...> struct typelist{};

Part 2: our "hash_state"

"hash_state", which I'm calling hash_token:

template<size_t N, class...Ts>
struct hash_token
{
    template<size_t M, class... Us>
    constexpr bool operator ==(const hash_token<M, Us...>&)const{return N == M;}
    constexpr size_t value() const{return N;}
};

hash_token simply encapsulates a size_t hash value (also accessible via the value() function) together with a comparison operator that checks whether two hash_tokens carry the same value. The comparison deliberately ignores the typelist, because two tokens can have different typelists but the same hash value: for example, the token you get from hashing int compares equal to the one you get after hashing (int, float, char, int).
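To make that comparison concrete, here is a small self-contained sketch that spells the tokens out by hand instead of producing them with type_hash. The variable names are hypothetical.

```cpp
#include <cstddef>

template <std::size_t N, class... Ts>
struct hash_token
{
    // Only the hash values are compared; the typelists are ignored.
    template <std::size_t M, class... Us>
    constexpr bool operator==(const hash_token<M, Us...>&) const { return N == M; }
    constexpr std::size_t value() const { return N; }
};

// Token after hashing only int: int sits at index 0.
constexpr hash_token<0, int> only_int{};
// Token after hashing int, float, char and then int again: the unique
// typelist is (int, float, char) and the last type hashed maps to index 0.
constexpr hash_token<0, int, float, char> int_again{};
// Token right after hashing char: same typelist, but index 2.
constexpr hash_token<2, int, float, char> after_char{};

static_assert(only_int == int_again);      // same hash, different typelists
static_assert(!(only_int == after_char));  // different hashes
```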

Part 3: type_hash function

Finally our type_hash function:

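template<class T, size_t N, class... Ts>
constexpr auto type_hash(T, hash_token<N, Ts...>) noexcept
{
    if constexpr(std::is_same_v<remove_duplicates_t<typelist<Ts..., T>>, typelist<Ts...>>)
    {
        // T was hashed before: reuse its index in the unique typelist
        return hash_token<detail::index_of<T, Ts...>(), Ts...>{};
    }
    else
    {
        // T is new: it is appended at index sizeof...(Ts).
        // (N + 1 would be wrong whenever the previous token refers to a
        // repeated type, because N is then smaller than the last index.)
        return hash_token<sizeof...(Ts), Ts..., T>{};
    }
}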

template<class T>
constexpr auto type_hash(T) noexcept
{
    return hash_token<0, T>{};
}

The first overload is for the generic case; you've already "hashed" a number of types, and you want to hash yet another one. It checks to see if the type you're hashing has already been hashed, and if so, it returns the index of the type in the unique type list.

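To get the index of a type in a typelist, I used a simple pack expansion to save some compile-time template instantiations (avoiding a recursive lookup). Because type_hash calls it as detail::index_of, it belongs in namespace detail and must be declared before type_hash:

namespace detail {
    // Find the index of T in Ts (assuming T is in Ts). The typelist is unique,
    // so there is exactly one match; with duplicates the last match would win.
    template<class T, class... Ts>
    constexpr size_t index_of()
    {
        size_t index = 0;
        size_t toReturn = 0;
        // Expand the pack inside a braced initializer so the elements are
        // evaluated left to right, incrementing index after each comparison.
        using swallow = size_t[];
        (void)swallow{0, (void(std::is_same_v<T, Ts> ? toReturn = index : index), ++index)...};

        return toReturn;
    }
}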
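For comparison, here is a sketch of an alternative lookup that is not part of the answer: it returns the first matching index and uses sizeof...(Ts) as a "not found" sentinel, so it does not have to assume that T is present. The name index_of_first is hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical alternative: index of the first occurrence of T in Ts,
// or sizeof...(Ts) when T does not occur at all.
template <class T, class... Ts>
constexpr std::size_t index_of_first() noexcept
{
    // One flag per element; the trailing false keeps the array non-empty
    // even when Ts is an empty pack.
    constexpr bool matches[] = {std::is_same_v<T, Ts>..., false};
    for (std::size_t i = 0; i < sizeof...(Ts); ++i)
        if (matches[i])
            return i;
    return sizeof...(Ts);
}

static_assert(index_of_first<float, int, float, char>() == 1);
static_assert(index_of_first<double, int, float>() == 2); // not found
```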

The second overload of type_hash is for creating an initial hash_token starting at 0.

Usage:

int main()
{
    auto x = []{};
    auto y = []{};
    auto z = x;
    std::cout << std::is_same_v<decltype(x), decltype(y)> << std::endl; // 0
    std::cout << std::is_same_v<decltype(x), decltype(z)> << std::endl; // 1

    constexpr auto xtoken = type_hash(x);
    constexpr auto xytoken = type_hash(y, xtoken);
    constexpr auto xyztoken = type_hash(z, xytoken);
    std::cout << (xtoken == xytoken) << std::endl; // 0
    std::cout << (xtoken == xyztoken) << std::endl; // 1
}

Conclusion:

Not really useful in a lot of code, but this may help solve some constrained meta-programming problems.
