简体   繁体   中英

Performance of dynamic_cast

I previous asked a question Why is dynamic_cast evil or not ? The answers made me to write some code about performance of dynamic_cast as follows.And I compiled and tested, the time consumed by dynamic_cast is slightly bigger than that without dynamic_cast .I didn't see the evidence of dynamic_cast is time consuming.Did I write the right code ?

The code is :

class Animal
{
public:
    virtual ~Animal(){};
};

class Cat : public Animal
{
public:
    std::string param1;
    std::string param2;
    std::string param3;
    std::string param4;
    std::string param5;
    int param6;
    int param7;
};

bool _process(Cat* cat)
{
    cat->param1 = "abcde";
    cat->param2 = "abcde";
    cat->param3 = "abcde";
    cat->param4 = "abcde";
    cat->param5 = "abcde";
    cat->param6 = 1;
    cat->param7 = 2;
    return true;
}

bool process(Animal *ptr)
{
    Cat *cat = dynamic_cast<Cat*>(ptr);
    if (cat == NULL)
    {
        return false;
    } 
    _process(cat);
    return true;
}
int main(int argc, char* argv[])
{
    /*
    argv[1] : object num
    */

    if (argc != 2)
    {
        std::cout << "Error: invalid argc " << std::endl;
        return -1;
    }

    int obj_num = atoi(argv[1]);
    if (obj_num <= 0)
    {
        std::cout << "Error: object num" << std::endl;
    }

    int c = 0;
    for (; c < obj_num; c++)
    {
        Cat cat;
        #ifdef _USE_CAST
        if (!process(&cat))
        {
            std::cout << "Error: failed to process " << std::endl;
            return -3;
        }
        #else
        if (!_process(&cat))
        {
            std::cout << "Error: failed to process " << std::endl;
            return -3;
        }

        #endif
    }

    return 0;
}

compile it using:

g++ -D_USE_CAST -o dynamic_cast_test  dynamic_cast_benchmark.c
g++ -o dynamic_cast_no_test dynamic_cast_benchmark.c

execute them using num, which is 1,10,100 ...:

$time ./dynamic_cast_test num
$time ./dynamic_cast_no_test num

The result:

                 dynamic_cast               non_dynamic_cast
num  10,000   
                real    0m0.010s            real    0m0.008s
                user    0m0.006s            user    0m0.006s
                sys     0m0.001s            sys     0m0.001s

     100,000 
                real    0m0.059s            real    0m0.056s
                user    0m0.054s            user    0m0.054s
                sys     0m0.001s            sys     0m0.001s

     1,000,000
                real    0m0.523s            real    0m0.519s
                user    0m0.517s            user    0m0.511s
                sys     0m0.001s            sys     0m0.004s

     10,000,000
                real    0m6.050s            real    0m5.126s
                user    0m5.641s            user    0m4.986s
                sys     0m0.036s            sys     0m0.019s

     100,000,000
                real    0m52.962s           real    0m51.178s
                user    0m51.697s           user    0m50.226s
                sys     0m0.173s            sys     0m0.092s

hardware & os:

OS:Linux
CPU:Intel(R) Xeon(R) CPU E5607  @ 2.27GHz  (4 cores)

You did write the right code, althought I would not have hard-coded the type to be a Cat. You can, just to play on the safe side, use the command line argument to decide wether to build a cat or a, say, dog (which you should implement also). Try disabling optimization also, in order to see if it's playing a significant role.

Finally, a word of caution is in order. Profiling is not as simple as taking a measurement on your computer, so you must be aware that what you are doing only takes you so far. It does give you an idea, don't think you are getting any everything-encompassing answer though.

I will reformulate my post.

Your code is correct, and it compiles well.

Since virtual methods and dynamic_cast operator are related issues, check this information from wiki, hope it will be useful.

wiki:

A virtual call requires at least an extra indexed dereference, and sometimes a "fixup" addition, compared to a non-virtual call, which is simply a jump to a compiled-in pointer. Therefore, calling virtual functions is inherently slower than calling non-virtual functions. An experiment done in 1996 indicates that approximately 6–13% of execution time is spent simply dispatching to the correct function, though the overhead can be as high as 50%.[4] The cost of virtual functions may not be so high on modern CPU architectures due to much larger caches and better branch prediction.

Furthermore, in environments where JIT compilation is not in use, virtual function calls usually cannot be inlined. While a compiler could replace the lookup and indirect call with, for instance, a conditional execution of each inlined body, such optimizations are not common.

To avoid this overhead, compilers usually avoid using vtables whenever the call can be resolved at compile time.

Thus, the call to f1 above may not require a vtable lookup because the compiler may be able to tell that d can only hold a D at this point, and D does not override f1. Or the compiler (or optimizer) may be able to detect that there are no subclasses of B1 anywhere in the program that override f1. The call to B1::f1 or B2::f2 will probably not require a vtable lookup because the implementation is specified explicitly (although it does still require the 'this'-pointer fixup).

Also, as you probably know, when you declare a virtual method in your class, depends on realization but almost always, your compiler will implicitly add virtual method table as new memeber to your class, thus each instance of this class will occupy more memory space, try sizeof on class with vm and without it.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM