Function pointer as "tag" in tagged union

One of the reasons why I tend to avoid tagged unions is that I don't like the performance penalty that the switch/case statement on the tag might introduce once the number of tags is greater than 4 or so.

I just had the idea that, instead of using a tag, one could store a pointer to a function that reads the last value written to the union. For example:

union u{
   char a;
   int b;
   size_t c;
};

struct fntag{
   void (*readval)(union u *ptr, void *out);
   union u val;
};

Then, whenever you write a value into val, you also update the readval pointer so that it points to a function that reads the field you last wrote in the union. There is one difficult issue: where to return the read value, because the same function pointer cannot point to functions returning different types. I chose to return the value through a pointer to void so that such a function can also be "overloaded" with C11 _Generic(), casting and writing into different types for the output.
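A minimal sketch of how this could be wired up. The setter and getter names (fntag_set_a, fntag_get, and so on) are hypothetical, and the sketch assumes that out always points to a size_t, which sidesteps the _Generic() part rather than implementing it:

```c
#include <assert.h>
#include <stddef.h>

union u {
    char a;
    int b;
    size_t c;
};

struct fntag {
    void (*readval)(union u *ptr, void *out);
    union u val;
};

/* Readers: one per union member.
   Assumption for this sketch: `out` always points to a size_t. */
static void read_a(union u *ptr, void *out) { *(size_t *)out = (size_t)ptr->a; }
static void read_b(union u *ptr, void *out) { *(size_t *)out = (size_t)ptr->b; }
static void read_c(union u *ptr, void *out) { *(size_t *)out = ptr->c; }

/* Hypothetical setters: each write also records the matching reader,
   so the function pointer plays the role of the tag. */
static void fntag_set_a(struct fntag *t, char v)   { t->val.a = v; t->readval = read_a; }
static void fntag_set_b(struct fntag *t, int v)    { t->val.b = v; t->readval = read_b; }
static void fntag_set_c(struct fntag *t, size_t v) { t->val.c = v; t->readval = read_c; }

/* Hypothetical getter: dispatches through the stored pointer, no switch. */
static size_t fntag_get(struct fntag *t)
{
    size_t out;
    assert(t->readval != NULL); /* a value must have been written first */
    t->readval(&t->val, &out);
    return out;
}
```

With these helpers, calling fntag_set_b(&t, 42) followed by fntag_get(&t) reads back the int member through read_b. Negative char or int values would wrap when widened to size_t, so a real design would need a different output convention.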

Of course, invoking a function through a pointer has a performance overhead, and I guess it's far heavier than checking the value of an enum, but at some point, if the number of tags is big, I believe it would be faster than switch/case.

My question is: have you ever seen this technique used somewhere? (I haven't, and I don't know whether that's because a real-world application of it would require _Generic() on the readval function, which requires C11, or because of some problem I'm not noticing at the moment.) How many tags do you guess you'd need for the function pointer to be faster than the switch/case on current Intel CPUs?

You could do that. In your case, a more optimization-friendly function signature would be size_t (*)(union u U) (all the union's values can more or less be fitted into a size_t, and the union is small enough that passing it by value is more efficient), but even with that, function calls have non-negligible overhead that tends to be significantly larger than that of a jump through a switch-generated jump table.

Try something like:

#include <stddef.h>
enum en { e_a, e_b, e_c };
union u{
   char a;
   int b;
   size_t c;
};

size_t u_get_a(union u U) { return U.a; }
size_t u_get_b(union u U) { return U.b; }
size_t u_get_c(union u U) { return U.c; }

struct fntag{ size_t (*u_get)(union u U); union u u_val; };
struct entag{ enum en u_type; union u u_val; };

struct fntag fntagged1000[1000]; struct entag entagged1000[1000];

void init(void) {
    for (size_t i=0; i<1000; i++)
        switch(i%3){
        break;case 0: 
            fntagged1000[i].u_val.a = i, fntagged1000[i].u_get = &u_get_a;
            entagged1000[i].u_val.a = i, entagged1000[i].u_type = e_a;
        break;case 1: 
            fntagged1000[i].u_val.b = i, fntagged1000[i].u_get = &u_get_b;
            entagged1000[i].u_val.b = i, entagged1000[i].u_type = e_b;
        break;case 2:
            fntagged1000[i].u_val.c = i, fntagged1000[i].u_get = &u_get_c;
            entagged1000[i].u_val.c = i, entagged1000[i].u_type = e_c;
        }
}


size_t get1000fromEnTagged(void)
{
    size_t r = 0;
    for(int i=0; i<1000; i++){
        switch(entagged1000[i].u_type){
        break;case e_a: r+=entagged1000[i].u_val.a;
        break;case e_b: r+=entagged1000[i].u_val.b;
        break;case e_c: r+=entagged1000[i].u_val.c;
        /*break;default: __builtin_unreachable();*/
        }
    }
    return r;
}

size_t get1000fromFnTagged(void)
{
    size_t r = 0;
    for(int i=0; i<1000; i++) r += (*fntagged1000[i].u_get)(fntagged1000[i].u_val);
    return r;
}



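int main(int C, char **V)
{
    size_t volatile r; /* volatile sink so the results are not discarded */
    (void)C;
    init();
    /* no argument: run the enum/switch version; any argument: the function-pointer version */
    if(!V[1]) for (int i=0; i<1000000; i++) r=get1000fromEnTagged();
    else for (int i=0; i<1000000; i++) r=get1000fromFnTagged();
}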

At -O2, I'm getting more than twice the performance with the switch-based code.
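To compare the two variants within a single run, something like the following sketch could be used. The names fill, sum_switch, sum_fnptr and time_reps are hypothetical, the values stored in the char member are kept below 100 as an assumption to stay in range, and clock() only gives coarse CPU-time readings, so treat any numbers it produces as rough:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

enum { N = 1000 };
enum en { e_a, e_b, e_c };
union u { char a; int b; size_t c; };

static size_t u_get_a(union u U) { return (size_t)U.a; }
static size_t u_get_b(union u U) { return (size_t)U.b; }
static size_t u_get_c(union u U) { return U.c; }

struct fntag { size_t (*u_get)(union u U); union u u_val; };
struct entag { enum en u_type; union u u_val; };

static struct fntag fn_items[N];
static struct entag en_items[N];

/* Fill both arrays with the same values, cycling through the three members. */
static void fill(void)
{
    for (size_t i = 0; i < N; i++) {
        switch (i % 3) {
        case 0:
            fn_items[i].u_val.a = (char)(i % 100); fn_items[i].u_get = u_get_a;
            en_items[i].u_val.a = (char)(i % 100); en_items[i].u_type = e_a;
            break;
        case 1:
            fn_items[i].u_val.b = (int)i; fn_items[i].u_get = u_get_b;
            en_items[i].u_val.b = (int)i; en_items[i].u_type = e_b;
            break;
        default:
            fn_items[i].u_val.c = i; fn_items[i].u_get = u_get_c;
            en_items[i].u_val.c = i; en_items[i].u_type = e_c;
            break;
        }
    }
}

/* Sum all values by switching on the enum tag. */
static size_t sum_switch(void)
{
    size_t r = 0;
    for (size_t i = 0; i < N; i++) {
        switch (en_items[i].u_type) {
        case e_a: r += (size_t)en_items[i].u_val.a; break;
        case e_b: r += (size_t)en_items[i].u_val.b; break;
        case e_c: r += en_items[i].u_val.c; break;
        }
    }
    return r;
}

/* Sum all values by calling through the stored function pointer. */
static size_t sum_fnptr(void)
{
    size_t r = 0;
    for (size_t i = 0; i < N; i++) r += fn_items[i].u_get(fn_items[i].u_val);
    return r;
}

/* Call f `reps` times; return elapsed CPU seconds, store the last sum in *last. */
static double time_reps(size_t (*f)(void), int reps, size_t *last)
{
    volatile size_t sink = 0;
    assert(f != NULL && reps > 0);
    clock_t t0 = clock();
    for (int i = 0; i < reps; i++) sink = f();
    clock_t t1 = clock();
    *last = sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

A driver would call fill() once, then time_reps(sum_switch, 1000000, &s1) and time_reps(sum_fnptr, 1000000, &s2), check that s1 == s2, and print both durations. An optimizer may treat the two loops differently when they live in one translation unit, so results need not match the separate runs described above.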


 