简体   繁体   中英

Defining a c++20 concept for hash functions

The CppCoreGuidelines state that concepts should be specified for all template arguments. (see: T.10: Specify concepts for all template arguments ) As practice for defining concepts, I am trying to build a hashtable with concepts defined for the Hash function and Key template arguments.

I want my hashtable to use two template arguments, HashFunc and Key . HashFunc should be a function object and Key should be the argument for the function object HashFunc .

That is, HashFunc(Key) should return a type convertible to size_t .

On cppreference , there is an example defining the concept Hashable . I replicated the example below:

template<typename T>
concept Hashable = requires(T a) {
    { std::hash<T>{}(a) } -> std::convertible_to<std::size_t>;
};

This Hashable concept makes sense for many uses. In these cases, hash functions on object of type T are specializations std::hash<T> . However, for my purposes, I don't want to assume that the Hash will be std::hash<Key> . I would like the user to be able to provide a different hash function.

Since HashFunc and Key are so tightly bound, I don't think I can define separate concepts for HashFunc and Key . Is that correct? So I would like to define a concept HashConcept that deals with HashFunc and Key simultaneously.

So I define one concept Hash that deals with both. I try my best to define the concept so that it matches the named requirement for Hash here . The goal then is to satisfy 4 conditions. Below this list, I talk about trying to enforce these conditions.

  1. The return type is convertible to std::size_t .
  2. The hash function displays equality preservation (h(k1) == h(k1) for the duration of the program. see C++ Extensions for Ranges section 19.1.1)
  3. If u is an lvalue Key , then h(u) does not modify u .
  4. "The probability of h(a)==h(b) for a!=b should approach 1.0/std::numeric_limits<std::size_t>::max() ."

Does this list appear complete?

I don't believe concepts can enforce (4), and (4) will just need to be indicated in comments/documentation. I believe that concepts might be able to enforce (2) and (3), but I'm not sure how. C++ Extensions for Ranges section 19.5 defines the concepts Callable and RegularCallable , then says "Note: The distinction between Callable and RegularCallable is purely semantic. — end note", suggesting that (2) cannot be enforced. That leaves (1) and (3).

I define a concept that enforces (1).

template<typename HashFunc, typename Key>
concept Hash = requires(HashFunc h, Key k) {
    { std::invoke(h, k) } -> std::convertible_to<std::size_t>;
};

Is this concept correct? (eg, should I have used requires or returned a bool ?) Can my concept be extended to address other requirements for hash functions, such as (2)-(4)?

Below is some example code that uses the concept. The result is to print 3 to the std::cout .

#include <functional>
#include <concepts>
#include <iostream>

template<typename HashFunc, typename Key>
concept HashConcept = requires(HashFunc h, Key k) {
    { std::invoke(h, k) } -> std::convertible_to<std::size_t>;
};


class HashFunc {
public:
    std::size_t operator()(int i) {
        return static_cast<size_t>(i);
    }
};

template<typename Hash, typename Key>
    requires HashConcept<Hash, Key>
size_t HashConceptUser(Hash h, Key k) {
    return h(k);
}

int main() {
    std::cout << HashConceptUser< HashFunc, int >(HashFunc{}, 3); 

}

Does this list appear complete?

The list is missing arguably the single most important criteria for a hash function: that if a == b then h(a) == h(b) .

The 4th criterion on the list is something you want for good hash functions, and is itself somewhat incomplete - you don't just want the likelihood of collision to be small, you also want random dispersion. The hash function h(i) = i satisfies the 4th criterion, but not a good hash function. On the flip side, h(i) = 0 is a terrible hash function but should be considered valid.


That said, C++ language concepts cannot enforce any of these things - you cannot enforce that the hash function is equality-preserving, you cannot enforce that it doesn't modify its inputs, and you cannot enforce anything about the distribution of its results. Those are what we would call semantic constraints rather than syntactic ones (the C++ standard speaks of satsifying a concept if the syntactic constraints are met and modeling a concept if the syntactic and semantic ones are met). The semantic constraints are documented requirements (in comments, or just documentation) rather than coded ones.

The best you can do the syntax is just verify that the hash function is invocable and gives you an integer:

template <typename F, typename T>
concept HashFor = std::regular_invocable<F, T>
               && std::convertible_to<std::invoke_result_t<F, T>, size_t>;

I am using regular_invocable here because that concept adds semantic constraints that you want: that the function call is equality-preserving and does not modify the function object or its arguments. You could also write it this way:

template <typename F, typename T>
concept HashFor = std::regular_invocable<F, T>
    && requires(F f, T t) {
        { std::invoke(f, t) } -> std::convertible_to<size_t>;
    };

But I would keep the regular_invocable part.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM