Fast and elegant one-way mapping of known integer values

I have to map a set of known integers to another set of known integers: a 1-to-1 relationship, all predefined and so on. So, suppose I have something like this (C++, simplified, but you'll get the idea):

struct s { int a; int b; };

s theMap[] = { {2, 5}, {79, 12958 } };

Now, given an input integer, say 79, I need to find the corresponding result from theMap (obviously 12958). Is there any nice and fast method of doing this, instead of your run-of-the-mill for loop? Other data structure suggestions are also welcome, but the map should be easy to write in the source by hand.

The values in both sets are in the range of 0 to 2^16, and there are only about 130 pairs. What I am also after is a very simple way of statically initializing the data.

Use a map:

#include <map>
#include <iostream>

int main() {
   std::map <int, int> m;
   m[79] = 12958; 
   std::cout << m[79] << std::endl;
}

Using a map is the most general solution and the most portable (the C++ standard does not yet support hash tables, but they are a very common extension). It isn't necessarily the fastest, though. Both the binary search and the hash map solutions suggested by others may (but are not guaranteed to) outperform it. This probably won't matter for most applications, however.

Sort the array by key and then perform a binary search.
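
A minimal sketch of that suggestion, assuming the pairs are written in ascending key order by hand. The names Pair, kTable, key_less and lookup are made up for illustration, and std::lower_bound does the binary search:

```cpp
#include <algorithm>
#include <cstddef>

struct Pair { int key; int value; };

// Hypothetical table; it must stay sorted by key for the binary search to work.
static const Pair kTable[] = { {2, 5}, {55, 100}, {79, 12958} };
static const std::size_t kTableSize = sizeof(kTable) / sizeof(kTable[0]);

// Orders a table entry against a bare key, as std::lower_bound requires.
static bool key_less(const Pair& entry, int key) { return entry.key < key; }

// Returns true and stores the mapped value in `out` if `key` is in the table.
bool lookup(int key, int& out) {
    const Pair* end = kTable + kTableSize;
    const Pair* it = std::lower_bound(kTable, end, key, key_less);
    if (it == end || it->key != key) return false;  // key not present
    out = it->value;
    return true;
}
```

With about 130 pairs this needs at most eight comparisons per lookup, and the table stays a plain static array that is easy to edit by hand.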

If you need compile-time mapping, you could use the following template:

// template to specialize
template<int T> struct int2int {};    

// macro for simplifying declaration of specializations
#define I2I_DEF(x, v) template<> struct int2int<x> { static const int value = v; };

// definitions
I2I_DEF(2, 5) I2I_DEF(79, 12958) I2I_DEF(55, 100) // etc.

// use
#include <iostream>    
int main()
{
  std::cout << int2int<2>::value << " " << int2int<79>::value << std::endl;

  return 0;
}

If the number of your source integers i is relatively high (so that a direct search becomes inefficient) but still manageable, you can relatively easily build a perfect hash function hash(i) for your input integers (using Pearson hashing, for example) and then use the hashed value as the index into the output table map:

output = map[hash(i)];

Of course, if the range of the input values is relatively small, you can use the identity function in place of hash and just turn the whole thing into a straightforward remapping:

output = map[i];

(Although if that were the case, you probably wouldn't even ask.)
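
For the identity-function case, a rough sketch could look like the following. It assumes every input and output fits in 16 bits (0 to 65535) and that 0xFFFF is never a real output, so it can serve as a "no mapping" marker. Pair, kPairs, kMissing, build_table and lookup are illustrative names only:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Pair { std::uint16_t key; std::uint16_t value; };

// Hand-written pairs; order does not matter for this approach.
static const Pair kPairs[] = { {2, 5}, {79, 12958} };

// Hypothetical sentinel for "no mapping"; assumes 0xFFFF is never a real output.
static const std::uint16_t kMissing = 0xFFFF;

// Builds a 65536-entry table indexed directly by the input value.
std::vector<std::uint16_t> build_table() {
    std::vector<std::uint16_t> table(65536, kMissing);
    for (std::size_t i = 0; i < sizeof(kPairs) / sizeof(kPairs[0]); ++i)
        table[kPairs[i].key] = kPairs[i].value;
    return table;
}

// One array access per lookup; the table is built once on first use.
std::uint16_t lookup(std::uint16_t key) {
    static const std::vector<std::uint16_t> table = build_table();
    return table[key];
}
```

The table costs 128 KB for roughly 130 useful entries, so this trades memory for the simplest possible lookup. A perfect hash would shrink the table at the price of computing hash(i) first.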

std::map<int, int> theMap;
theMap[2] = 5;
std::map<int, int>::const_iterator iter = theMap.find(2);
if (iter != theMap.end())
   iter->second; // found it

Insert pairs of ints and retrieve the value by key, with logarithmic complexity. If you have a really large data set and need faster retrieval, use std::tr1::unordered_map or boost::unordered_map (in case your standard library doesn't have a TR1 implementation).

std::map or std::unordered_map is probably the cleanest you'll get. Unfortunately, C++ has no built-in associative arrays.

std::map<int,int> mymap; // the same with unordered map

// one way of inserting
mymap.insert ( std::make_pair(2,5) );
mymap.insert ( std::make_pair(79,12958) );

// another
mymap[2] = 5;
mymap[79] = 12958;

To check:

std::map<int,int>::const_iterator iter = mymap.find(2);
if ( iter != mymap.end() )
{
   // found
   int value = iter->second;
}

unordered_map has the advantage of O(1) average lookup time, as opposed to the O(log n) of map.
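
As a hedged illustration of the unordered_map variant, the sketch below assumes a C++11 (or later) compiler, where std::unordered_map and brace initialization are available. The function name lookup and its fallback parameter are invented for this example:

```cpp
#include <unordered_map>

// Returns the mapped value for `key`, or `fallback` if the key is not present.
int lookup(int key, int fallback) {
    // Built once on first call; brace initialization assumes C++11 or later.
    static const std::unordered_map<int, int> table = { {2, 5}, {79, 12958} };
    std::unordered_map<int, int>::const_iterator it = table.find(key);
    return it != table.end() ? it->second : fallback;
}
```

Using find rather than operator[] keeps the table const and avoids inserting a default value for keys that are missing.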

As a supplement, if you need a binary search implementation, don't overlook the C++ Standard Library. The following does one on an array of your structure type using the equal_range algorithm (apologies for the somewhat hacky quality of the code):

#include <algorithm>
#include <iostream>
using namespace std;

struct S {
    int k, v;
};

bool operator <( const S & a, const S & b ) {
    return a.k < b.k;
}

// must be sorted in key order
S values[] = {{42,123},{666,27}};

int main() {

    S t;
    cin >> t.k;

    S * valend = &values[0] + sizeof(values) / sizeof(S);
    pair <S*,S*> pos = equal_range( &values[0], valend , t);

    if ( pos.first != pos.second ) {
        cout << pos.first->v << endl;
    }
    else {
        cout << "no" << endl;
    }
}

Why not a hashed map? It will give you more or less constant retrieval times for any key.

You had the right idea, it's a map. Use std::map.

Jump table. A switch will likely set this up if you are able to use one; otherwise you may need some assembly, but that's probably the fastest way.
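
A sketch of the switch form, with the caveat that whether the compiler emits an actual jump table is an assumption: for sparse keys like these, compilers often generate a chain or tree of comparisons instead. The function name lookup and the -1 "not found" marker are made up for illustration:

```cpp
// Returns the mapped value, or -1 (a hypothetical "not found" marker)
// for inputs that have no mapping.
int lookup(int key) {
    switch (key) {
        case 2:  return 5;
        case 55: return 100;
        case 79: return 12958;
        default: return -1;
    }
}
```

Each pair is one line of source, so the mapping is still easy to maintain by hand.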

You can use boost::assign.

#include <iostream>
#include <map>
#include <boost/assign.hpp>

int main()
{
    typedef std::map< int, int > int2int_t;
    typedef int2int_t::const_iterator int2int_cit;

    const int2int_t theMap
        = boost::assign::map_list_of
            ( 2, 5 )
            ( 79, 12958 )
            ;

    int2int_cit it = theMap.find( 2 );
    if ( it != theMap.end() )
    {
        const int result = it->second;
        std::cout << result << std::endl;
    }
}

If you are 100% certain theMap won't grow to over 1,000 entries (profile!), it's probably faster to do a binary search.

If the value of a has a reasonable bound (e.g. below 1,000), you can just make a simple array with a as the index for guaranteed O(1) complexity. If you're using gcc, you can use this syntax (http://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html#Designated-Inits):

int theMap[256] = { [2] = 5, [79] = 12958 };

(Unfortunately, this is not supported by g++.)

In all other cases, use std::unordered_map as shown in the other answers.

Your pseudocode is almost valid C++0x code, but C++0x requires less!

map<int, int> theMap = { {2, 5}, {79, 12958 } };
assert ( theMap[ 2 ] == 5 );

In "normal" C++, you have to initialize the map like this, which is still quite elegant:

pair< int, int > map_array[2] = { make_pair(2, 5), make_pair(79, 12958) };
map< int, int > theMap( &map_array[0], &map_array[2] ); // copies the pairs into the map, ordered by key
assert ( theMap[ 2 ] == 5 );

This is fast to write and fast to run!

Edit: Just don't make the map a global variable. (Although that is safe in C++0x.) If you do, it will only be initialized properly if the compiler chooses to initialize it after map_array, which is very much not guaranteed. If you want it to be a global, fill it at startup with theMap.insert( &map_array[0], &map_array[2] ); instead.
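
One common way to sidestep the initialization-order concern is a function-local static, which is constructed the first time the function is called. This is only a sketch; the_map, entries and instance are hypothetical names, and thread-safe initialization of local statics is only guaranteed from C++11 onwards:

```cpp
#include <map>
#include <utility>

// Returns the shared map, building it the first time the function is called.
const std::map<int, int>& the_map() {
    // Both statics live inside the function, so their relative order is fixed.
    static const std::pair<int, int> entries[] = {
        std::make_pair(2, 5), std::make_pair(79, 12958)
    };
    static const std::map<int, int> instance(
        entries, entries + sizeof(entries) / sizeof(entries[0]));
    return instance;
}
```

Callers write the_map().find(79) instead of touching a global, so the map is always fully built before it is used.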

There is also a technique known as "xmacros", which is a nice way to do exactly what you are talking about as well. However, it is easy to abuse the technique, so I always recommend using it with care. Check out: http://en.wikipedia.org/wiki/C_preprocessor#X-Macros

The basic gist is that you have a file where you list your mappings, say foo.txt, which looks like this:

MAP(2,5)
MAP(79,12958)
...

Then you define a macro MAP(A,B) that takes those two arguments and does your initialization for you. Then #include the file (foo.txt). You can even do it in multiple passes if you like, by redefining the macro between each #include of the file. Then, to add more mappings, you simply add them to foo.txt and recompile. It is very powerful and can be used for many different things.
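
To keep the illustration in a single file, the sketch below replaces the #include of foo.txt with a list macro, which is a variation on the approach described above rather than the exact same thing. MAPPINGS, AS_PAIR, AS_CASE, Pair, kTable and lookup are all hypothetical names:

```cpp
#include <cstddef>

// Stand-in for the contents of foo.txt: one MAP(input, output) entry per pair.
#define MAPPINGS(MAP) \
    MAP(2, 5)         \
    MAP(79, 12958)

struct Pair { int key; int value; };

// First pass: expand each entry into an array initializer.
#define AS_PAIR(A, B) { A, B },
static const Pair kTable[] = { MAPPINGS(AS_PAIR) };
#undef AS_PAIR

// Second pass: expand the same list into switch cases.
#define AS_CASE(A, B) case A: return B;
int lookup(int key) {
    switch (key) {
        MAPPINGS(AS_CASE)
        default: return -1;  // hypothetical "not found" marker
    }
}
#undef AS_CASE

static const std::size_t kTableSize = sizeof(kTable) / sizeof(kTable[0]);
```

Adding a pair means adding one line to MAPPINGS, and both the array and the switch pick it up on the next compile.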

If you don't want to use a map for whatever reason (for example, you just want to use the array you set up at compile time), you can also use a functor in combination with <algorithm>:

#include <cstddef>
#include <algorithm>
#include <iostream>
using namespace std;

struct s { int a; int b; };

s theMap[] = { {2, 5}, {79, 12958 } };

// Predicate that matches an entry whose key equals the one given.
struct match_key
{
    match_key(int key) : key_(key) {}
    bool operator()(const s& rhs) const
    {
        return rhs.a == key_;
    }
private:
    int key_;
};

int main()
{
    size_t mapSize = sizeof(theMap)/sizeof(theMap[0]);
    s* end = theMap + mapSize;
    s* it = find_if(theMap, end, match_key(79));
    if (it != end)   // find_if returns end when nothing matches
        cout << it->b << endl;

    return 0;
}
