
Make std's data-structure use my existing non-static hash function "hashCode()" by default

I have a moderate-size codebase (>200 .cpp files) that uses a function hashCode() to return a hash number:

class B01{  //a class
    //..... complex thing ....
    public: size_t hashCode(){ /* hash algorithm #H01 */}  
};
class B02{  //just another unrelated class
    //..... complex thing ....
    public: size_t hashCode(){/* #H02 */}  //This is the same name as above
};

I have used it in various locations, e.g. in my custom data structures. It works well.

Now, I want to make the hash algorithm recognized by the std:: data structures.

Here is what I should do (modified from cppreference; I will call this code #D):

//#D
namespace std {
    template<> struct hash<B01> {
        std::size_t operator()(const B01& b) const {
            /* hash algorithm #H01 */
        }
    };
}

If I insert the block #D (with an appropriate implementation) for every class (B01, B02, ...), I can write:

std::unordered_set<B01> b01s;
std::unordered_set<B02> b02s;

without passing the second template argument,
and my hash algorithm (#H01) will be called by default.
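For reference, a complete version of block #D might look like the sketch below. The class Widget and the helper countDistinctWidgets are hypothetical stand-ins, not part of the codebase described here. Two assumptions are worth stating: hashCode() must be const-qualified so it can be called through the const reference that std::hash receives, and the class needs an operator== because std::unordered_set uses std::equal_to by default.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for one of the Bxx classes.
class Widget {
public:
    explicit Widget(int id) : id_(id) {}
    // const-qualified so it can be called through a const reference
    std::size_t hashCode() const { return static_cast<std::size_t>(id_) * 31u; }
    bool operator==(const Widget& other) const { return id_ == other.id_; }
private:
    int id_;
};

namespace std {
    // Full specialization for a program-defined type, as in block #D.
    template <> struct hash<Widget> {
        std::size_t operator()(const Widget& w) const { return w.hashCode(); }
    };
}

// Inserts a duplicate to show that the set deduplicates via hash + operator==.
inline std::size_t countDistinctWidgets() {
    std::unordered_set<Widget> widgets;  // no second template argument
    widgets.insert(Widget(1));
    widgets.insert(Widget(2));
    widgets.insert(Widget(1));
    return widgets.size();
}
```

Note that the hashCode() members shown in the question are not const; under this sketch's assumptions they would need a const qualifier before any std::hash specialization could forward to them.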

Question

To make it recognize all of my B01::hashCode, B02::hashCode, ...,
do I have to insert the block #D into all 200+ Bxx.h?

Can I just add a single block #D (in a top header?)
and, from there, re-route std::anyDataStructure to call hashCode() whenever possible?

//pseudo code
namespace std{
    template<> struct hash<X>   {
        std::size_t operator()(const X& x) const { // std::enable_if??
            if(X has hashCode()){    //e.g. T=B01 or B02       
                make this template highest priority   //how?
                return hashCode();
            }else{                   //e.g. T=std::string
                don't match this template;  
            }
        }
    };
}

It sounds like a SFINAE question to me.

Side note: The most similar question on SO didn't ask how to achieve this.

Edit (Why don't I just refactor it?; 3 Feb 2017)

  • I don't know if brute-force refactoring is the right path. I guess there might be a better way.
  • hashCode() is my home. I am emotionally attached to it.
  • I want to keep my code as short and clean as possible. std:: blocks are dirty.
  • It may be just my curiosity. If I stubbornly refuse to refactor my code, how far can C++ go?

It doesn't have to be that way; you can also have a functor:

struct MyHash {
    template <class T>
    auto hashCode(const T & t, int) const -> decltype(t.hashCode()) {
        return t.hashCode();
    }
    template <class T>
    auto hashCode(const T & t, long) const -> decltype(std::hash<T>{}(t)) {
        return std::hash<T>{}(t);
    }
    
    template <class T>
    auto operator()(const T & t) const -> decltype(hashCode(t,42)) {
        return hashCode(t,42);
    }
};

Then define an alias of std::unordered_set with MyHash as the hash type:

template <class Key>
using my_unordered_set = std::unordered_set<Key, MyHash>;

or, more completely, if you also want to be able to provide the Equal functor and the allocator:

template<
    class Key,
    class KeyEqual = std::equal_to<Key>,
    class Allocator = std::allocator<Key>
>
using my_unordered_set = std::unordered_set<Key, MyHash, KeyEqual, Allocator>;

Then use it (with any of your Bxx) like you'd use std::unordered_set:

int main() {
    my_unordered_set<B01> b01s;
    my_unordered_set<B02> b02s;

    // or lonely with your type:
    B01 b01{/*...*/};
    std::cout << MyHash{}(b01) << std::endl;

    // or any other:
    std::string str{"Hello World!"};
    std::cout << MyHash{}(str) << std::endl;
}
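The int/long tag in MyHash is what makes hashCode() win when both overloads are viable. A minimal sketch of that ranking trick follows; FallbackHash, pick and HasHashCode are hypothetical names used only for illustration. The literal 0 is an int, so the int overload is an exact match and is preferred; when t.hashCode() is ill-formed, that overload drops out through SFINAE and the long overload is used instead.

```cpp
#include <cstddef>
#include <functional>
#include <string>

struct FallbackHash {
    // Preferred overload: viable only when t.hashCode() is well-formed.
    // The tag argument 0 is an int, so this is an exact match.
    template <class T>
    static auto pick(const T& t, int) -> decltype(t.hashCode()) {
        return t.hashCode();
    }
    // Fallback: needs an int -> long conversion, so it ranks lower.
    template <class T>
    static auto pick(const T& t, long) -> decltype(std::hash<T>{}(t)) {
        return std::hash<T>{}(t);
    }
    template <class T>
    std::size_t operator()(const T& t) const { return pick(t, 0); }
};

// Hypothetical type that only offers hashCode().
struct HasHashCode {
    std::size_t hashCode() const { return 7; }
};
```

With this sketch, FallbackHash{}(HasHashCode{}) goes through hashCode(), while FallbackHash{}(std::string("abc")) falls back to std::hash<std::string>.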

Concepts

If you can use concepts, they allow you to specialize the std::hash class the way you want:

template <class T>
concept HashCodeConcept = requires(T const & t)
{
    {t.hashCode()} -> std::same_as<std::size_t>;
};

namespace std {
    template <HashCodeConcept T>
    struct hash<T> {
        std::size_t operator()(const T& t) const {
            return  t.hashCode();
        }
    };
}

While creating conditions to default the hash parameter of std container templates to member methods of groups of classes, one should avoid introducing new issues:

  • Redundancy
  • Portability problems
  • Arcane constructs

The classic object-oriented approach may require a patterned edit of the 200+ classes to ensure they provide the basics needed for use in std hash containers. Some options for group transformation are given below to provide the two needed methods:

  • A public hashCode() is defined in the concrete class where it is unique to that class, or by inheritance if it follows a pattern common across classes.
  • A public operator==() is defined.

The Two Templates

These two templates remove the redundancy and simplify the declaration as indicated.

template <typename T>
    struct HashStruct {
        std::size_t operator()(const T & t) const {
            return t.hashCode();
        } };
template <class T>
    using SetOfB = std::unordered_set<T, HashStruct<T>>;

Saving Integration Time

An example super-class:

class AbstractB {
    ...
    public:
        virtual std::size_t hashCode() const {
            return std::hash<std::string>{}(ms1)
                    ^ std::hash<std::string>{}(ms2);
        } };

The following sed expression may save transformation time, assuming the code places { on the same line as the class declaration. Similar expressions would work with Boost or with a scripting language like Python.

"s/^([ \t]*class +B[a-zA-Z0-9]+ *)(:?)(.*)$"
        + "/\1 \2 : public AbstractB, \3 [{]/"
        + "; s/ {2,}/ /g"
        + "; s/: ?:/:/g"

An AST-based tool would be more reliable. This explains how to use clang capabilities for code transformation. There are new additions such as this Python controller of C++ code transformation.

Discussion

There are several options for where the hash algorithm can reside.

  • A method of a std container declaration's abstract class
  • A method of a concrete class (such as #H01 in the example)
  • A struct template (generally counterproductive and opaque)
  • The default std::hash

Here's a compilation unit that provides a clean demonstration of the classic approach to accomplishing the desired defaulting and the other three goals listed above, while offering flexibility in where the hash algorithm is defined for any given class. Various features could be removed depending on the specific case.

#include <string>
#include <functional>
#include <unordered_set>

template <typename T>
    struct HashStructForPtrs {
        std::size_t operator()(const T tp) const {
            return tp->hashCode(); } };
template <class T>
    using SetOfBPtrs = std::unordered_set<T, HashStructForPtrs<T>>;

template <typename T>
    struct HashStruct {
        std::size_t operator()(const T & t) const {
            return t.hashCode(); } };
template <class T>
    using SetOfB = std::unordered_set<T, HashStruct<T>>;

class AbstractB {
    protected:
        std::string ms;
    public:
        virtual std::size_t hashCode() const {
            return std::hash<std::string>{}(ms); }
        // other option: virtual std::size_t hashCode() const = 0;
        bool operator==(const AbstractB & b) const {
            return ms == b.ms; } };

class B01 : public AbstractB {
    public:
        std::size_t hashCode() const {
            return std::hash<std::string>{}(ms) ^ 1; } };

class B02 : public AbstractB {
    public:
        std::size_t hashCode() const {
            return std::hash<std::string>{}(ms) ^ 2; } };

int main(int iArgs, char * args[]) {

    SetOfBPtrs<AbstractB *> setOfBPointers;
    setOfBPointers.insert(new B01());
    setOfBPointers.insert(new B02());

    SetOfB<B01> setOfB01;
    setOfB01.insert(B01());

    SetOfB<B02> setOfB02;
    setOfB02.insert(B02());

    return 0; };
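One detail of the pointer variant above: SetOfBPtrs hashes through the pointer but still compares elements with the default std::equal_to, which for raw pointers compares addresses. Two distinct objects with equal contents are therefore stored as two elements. If value semantics are wanted, one possible sketch is to add an equality functor that dereferences. Item, HashForPtrs, EqualForPtrs and countPointerSets are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical minimal class with the two members the containers rely on.
class Item {
public:
    explicit Item(std::string s) : ms(std::move(s)) {}
    std::size_t hashCode() const { return std::hash<std::string>{}(ms); }
    bool operator==(const Item& other) const { return ms == other.ms; }
private:
    std::string ms;
};

// Hashes the pointee, as HashStructForPtrs does.
struct HashForPtrs {
    std::size_t operator()(const Item* p) const { return p->hashCode(); }
};
// Assumption: compares the pointees instead of the addresses.
struct EqualForPtrs {
    bool operator()(const Item* a, const Item* b) const { return *a == *b; }
};

// Returns {size with address equality, size with pointee equality}.
inline std::pair<std::size_t, std::size_t> countPointerSets() {
    Item first("same"), second("same");  // equal contents, distinct addresses
    std::unordered_set<const Item*, HashForPtrs> byAddress;
    byAddress.insert(&first);
    byAddress.insert(&second);
    std::unordered_set<const Item*, HashForPtrs, EqualForPtrs> byValue;
    byValue.insert(&first);
    byValue.insert(&second);
    return {byAddress.size(), byValue.size()};
}
```

Whether address or value equality is the right choice depends on how the pointers are used; the sketch only makes the difference visible.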

A SFINAE-based method of the type you were looking for requires partial specialisation of std::hash. This could be done if your classes Bxx are templates (which is the case if they are derived from a CRTP base). For example (fleshed out in an edit):

#include <type_traits>
#include <unordered_set>
#include <iostream>

template<typename T = void>
struct B {
  B(int i) : x(i) {}
  std::size_t hashCode() const
  {
    std::cout<<"B::hashCode(): return "<<x<<std::endl;
    return x;
  }
  bool operator==(B const&b) const
  { return x==b.x; }
private:
  int x;
};

template<typename T,
         typename = decltype(std::declval<T>().hashCode())> 
using enable_if_has_hashCode = T;

namespace std {
  template<template<typename...> class T, typename... As> 
  struct hash<enable_if_has_hashCode<T<As...>>> 
  {
    std::size_t operator()(const T<As...>& x) const
    { return x.hashCode(); }
  };
  // the following would not work, as it's not a partial specialisation
  //    (some compilers allow it, but clang correctly rejects it)
  // template<typename T>
  // struct hash<enable_if_has_hashCode<T>>
  // { /* ... */ };
}

int main()
{
  using B00 = B<void>;
  B00 b(42);
  std::unordered_set<B00> set;
  set.insert(b);
}

produces (using clang++ on MacOS)

B::hashCode(): return 42

See also this related answer to a similar question of mine.

However, concepts are the way of the future to solve problems like this.

I have come up with something that appears to partially work. It is a workaround that will allow you to use std::hash on a type that implements hashCode. Take a look:

   //some class that implements hashCode
struct test
{
    std::size_t hashCode() const
    {
        return 0;//insert your hash routine
    }
};
//helper class
struct hashable
{
    hashable():value(0){}
    template<typename T>
    hashable(const T& t):value(t.hashCode())
    {}
    template<typename T>
    std::size_t operator()(const T& t) const
    {
        return t.hashCode();
    }

    std::size_t value;
};


//hash specialization of hashable
namespace std {
    template<>
    struct hash<hashable>
    {
        typedef hashable argument_type;
        typedef std::size_t result_type;
        result_type operator()(const argument_type& b) const {
            return b.value;
        }
    };
}
//helper alias so you dont have to specify the hash each time.
template<typename T, typename hash = hashable>
using unordered_set = std::unordered_set<T,hash>;

int main(int argc, char** argv)
{
    unordered_set<test> s;
    test t;
    std::cout<<std::hash<hashable>{}(t)<<std::endl;
}

The code takes advantage of hashable's template constructor and template operator to retrieve the hash from any class that implements hashCode. The std::hash specialization is looking for an instance of hashable, but the templated constructor allows an instance to be constructed from any class that has hashCode.

The only gotcha here is that you will have to write unordered_set rather than std::unordered_set to use it, and you will have to make sure that std::unordered_set is not brought into scope in any way. So you won't be able to have anything like using namespace std or using std::unordered_set in your source. But besides the few gotchas in the usage, this could work for you.

Of course this is just a band-aid on the real issue... which would be not wanting to go through the pain of properly specializing std::hash for each of your types. (I don't blame you)

I would also like to note that with this code a substitution failure is an error... if you would prefer SFINAE, it will need modification.
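One possible shape for that modification, as a sketch only: constrain the template constructor with a defaulted template parameter so that a type without hashCode() removes the constructor from overload resolution instead of failing inside its body. SfinaeHashable, WithHashCode and the two accepts... helpers are hypothetical names; the same constraint would also have to be applied to the template operator(), which is omitted here.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

struct SfinaeHashable {
    SfinaeHashable() : value(0) {}
    // The defaulted second template parameter is well-formed only when
    // t.hashCode() compiles; otherwise this constructor is not a candidate.
    template <typename T,
              typename = decltype(std::declval<const T&>().hashCode())>
    SfinaeHashable(const T& t) : value(t.hashCode()) {}

    std::size_t value;
};

// Hypothetical type that offers hashCode().
struct WithHashCode {
    std::size_t hashCode() const { return 5; }
};

// True: the constrained constructor accepts a type with hashCode().
constexpr bool acceptsWithHashCode() {
    return std::is_constructible<SfinaeHashable, const WithHashCode&>::value;
}
// False: std::string has no hashCode(), so no constructor is viable.
constexpr bool acceptsString() {
    return std::is_constructible<SfinaeHashable, const std::string&>::value;
}
```

Because the failure now happens during overload resolution, traits such as std::is_constructible can be used to query the type without triggering a hard error.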

EDIT:

After trying to run:

unordered_set<test> s;
test t;
s.insert(t);

I noticed there were some compiler errors.

I've updated my test class to be equality comparable by adding:

bool operator==(const test& other) const
{
    return hashCode() == other.hashCode();
}

to test, which now makes:

//some class that implements hashCode
struct test
{
    std::size_t hashCode() const
    {
        return 0;//insert your hash routine
    }
    bool operator==(const test& other) const
    {
        return hashCode() == other.hashCode();
    }
};

Solution one

If you can make the classes B01, B02, ... class templates with a dummy parameter, you could simply specialize std::hash for a template template parameter that takes the dummy template parameter:

#include <iostream>
#include <unordered_set>

struct Dummy {};

template <class = Dummy>
class B01{ 
    public: size_t hashCode() const { return 0; }  
};
template <class = Dummy>
class B02{ 
    public: size_t hashCode() const { return 0; } 
};

namespace std{
    template<template <class> class TT> struct hash<TT<Dummy>>   {
        std::size_t operator()(const TT<Dummy>& x) const { 
            return x.hashCode();
        }
    };
}

int main() {
    std::unordered_set<B01<>> us;
    (void)us;
}

[live demo]

Solution two (contains an error; don't use it)

But I believe what you desire looks more like this:

#include <iostream>
#include <unordered_set>

class B01{ 
    public: size_t hashCode() const { return 0; }  
};

class B02{ 
    public: size_t hashCode() const { return 0; } 
};

template <class T, class>
using enable_hash = T;

namespace std{
    template<class T> struct hash<enable_hash<T, decltype(std::declval<T>().hashCode())>>   {
        std::size_t operator()(const T& x) const { 
            return x.hashCode();
        }
    };
}

int main() {
    std::unordered_set<B01> us;
    (void)us;
}

[live demo]

(Inspired by this answer)

However, although this works on gcc, it isn't really allowed by the C++ standard (but I'm also not sure whether it is actually literally disallowed...). See this thread in this context.

Edit:

As pointed out by @Barry, this gcc behaviour is not mandated by the C++ standard, and as such there is absolutely no guarantee that it will work even in the next gcc version... It can even be perceived as a bug, as it allows a partial specialization of a template that in fact does not specialize that template.
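The reason the partial specialization "does not specialize" can be seen from the alias itself: enable_hash<T, X> is just another name for T, so hash<enable_hash<T, ...>> names hash<T>, the same argument list as the primary template. A small sketch that checks this follows; WithHashCode and aliasIsTransparent are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

template <class T, class>
using enable_hash = T;

// Hypothetical type with a hashCode() member.
struct WithHashCode {
    std::size_t hashCode() const { return 0; }
};

// The alias discards its second argument and names exactly T.
constexpr bool aliasIsTransparent() {
    return std::is_same<enable_hash<int, void>, int>::value &&
           std::is_same<
               enable_hash<WithHashCode,
                           decltype(std::declval<WithHashCode>().hashCode())>,
               WithHashCode>::value;
}

static_assert(aliasIsTransparent(), "enable_hash<T, X> is the same type as T");
```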

Solution three (preferred)

Another way could be to specialize std::unordered_set instead of std::hash:

#include <iostream>
#include <type_traits>
#include <unordered_set>

class specializeUnorderedSet { };

class B01: public specializeUnorderedSet { 
    public: size_t hashCode() const { return 0; }  
};

class B02: public specializeUnorderedSet { 
    public: size_t hashCode() const { return 0; } 
};

template <class T>
struct my_hash {
    std::size_t operator()(const T& x) const { 
        return x.hashCode();
    }
};

template <class...>
using voider = void;

template <class T, class = void>
struct hashCodeTrait: std::false_type { };

template <class T>
struct hashCodeTrait<T, voider<decltype(std::declval<T>().hashCode())>>: std::true_type { };

namespace std{

    template <class T>
    struct unordered_set<T, typename std::enable_if<hashCodeTrait<T>::value && std::is_base_of<specializeUnorderedSet, T>::value, std::hash<T>>::type, std::equal_to<T>, std::allocator<T>>:
           unordered_set<T, my_hash<T>, std::equal_to<T>, std::allocator<T>> { };

}

int main() {
    std::unordered_set<B01> us;
    (void)us;
}

According to the discussion presented here it should be perfectly valid. It also works in gcc, clang, icc, and VS.

To be able to use the code without touching the code of the classes, I believe we can utilize the ADL rules to make a SFINAE check of whether a given class involves the std namespace. You can find some background here. Credits also to Cheers and hth. - Alf. The approach could be changed as follows:

#include <utility>
#include <unordered_set>
#include <string>
#include <type_traits>
#include <functional>

template< class Type >
void ref( Type&& ) {}

template< class Type >
constexpr auto involve_std()
   -> bool
{
    using std::is_same;
    using std::declval;
    return not is_same< void, decltype( ref( declval<Type &>() ) )>::value;
}

class B01 { 
    public: size_t hashCode() const { return 0; }  
};

class B02 { 
    public: size_t hashCode() const { return 0; } 
};

template <class T>
struct my_hash {
    std::size_t operator()(const T& x) const { 
        return x.hashCode();
    }
};

template <class...>
struct voider {
    using type = void;
};

template <class T, class = void>
struct hashCodeTrait: std::false_type { };

template <class T>
struct hashCodeTrait<T, typename voider<decltype(std::declval<T>().hashCode())>::type>: std::true_type { };

namespace std{

    template <class T>
    struct unordered_set<T, typename std::enable_if<hashCodeTrait<T>::value && !involve_std<T>(), std::hash<T>>::type, std::equal_to<T>, std::allocator<T>>:
           unordered_set<T, my_hash<T>, std::equal_to<T>, std::allocator<T>> { };

}

int main() {
    std::unordered_set<B01> usb01;
    std::unordered_set<std::string> uss;
    static_assert(std::is_base_of<std::unordered_set<B01, my_hash<B01>>, std::unordered_set<B01>>::value, "!");
    static_assert(!std::is_base_of<std::unordered_set<std::string, my_hash<std::string>>, std::unordered_set<std::string>>::value, "!");
    (void)usb01;
    (void)uss;
}

[gcc test], [clang test], [icc test], [gcc 4.9], [VC]
