Function optimized for compile-time constant

Suppose that I have a vector length calculation function, which has an additional inc parameter (this tells the distance between neighboring elements). A simple implementation would be:

float calcLength(const float *v, int size, int inc) {
    float l = 0;

    for (int i=0; i<size*inc; i += inc) {
        l += v[i]*v[i];
    }
    return sqrt(l);
}

Now, calcLength can be called with two kinds of inc arguments: those known at compile time and those that are not. I'd like to have an optimized calcLength version for common compile-time values of inc (like 1).

So, I'd have something like this:

template <int C>
struct Constant {
    static constexpr int value() {
        return C;
    }
};

struct Var {
    int v;

    constexpr Var(int p_v) : v(p_v) { }

    constexpr int value() const {
        return v;
    }
};

template <typename INC>
float calcLength(const float *v, int size, INC inc) {
    float l = 0;

    for (int i=0; i<size*inc.value(); i += inc.value()) {
        l += v[i]*v[i];
    }
    return sqrt(l);
}

So, this can be used:

calcLength(v, size, Constant<1>()); // inc is a compile-time constant 1 here, calcLength can be vectorized

or

int inc = <some_value>;
calcLength(v, size, Var(inc)); // inc is not a compile-time constant here, so there are fewer opportunities for compiler optimization

My question is: would it be possible to keep the original interface and have Constant / Var inserted automatically, depending on whether inc is a compile-time constant or not?

calcLength(v, size, 1); // this should end up calcLength(v, size, Constant<1>());
calcLength(v, size, inc); // this should end up calcLength(v, size, Var(inc));

Note: this is a simple example. In my actual problem, I have several functions like calcLength, and they are large, so I don't want the compiler to inline them.


Note 2: I'm open to different approaches as well. Basically, I'd like a solution that fulfills these requirements:

  • the algorithm is specified once (most likely in a template function)
  • if I specify 1 as inc, a special function is instantiated, and the code most likely gets vectorized
  • if inc is not a compile-time constant, a general function is called
  • otherwise (non-1 compile-time constant): doesn't matter which function is called

If the goal here is simply to optimize, rather than enable use in a compile-time context, you can give the compiler hints about your intent:

static float calcLength_inner(const float *v, int size, int inc) {
    float l = 0;

    for (int i=0; i<size*inc; i += inc) {
        l += v[i]*v[i];
    }
    return sqrt(l);
}

float calcLength(const float *v, int size, int inc) {
    if (inc == 1) {
        return calcLength_inner(v, size, inc);  // compiler knows inc == 1 here, and will optimize
    }
    else {
        return calcLength_inner(v, size, inc);
    }
}

On godbolt, you can see that calcLength_inner has been compiled twice, once with and once without the constant propagation.

This is a C trick (and is used extensively inside numpy), but you can write a simple wrapper to make it easier to use in C++:

// give the compiler a hint that it can optimize `f` with knowledge of `cond`
template<typename Func>
auto optimize_for(bool cond, Func&& f) {
    if (cond) {
        return std::forward<Func>(f)();
    }
    else {
        return std::forward<Func>(f)();
    }
}

float calcLength(const float *v, int size, int inc) {
    return optimize_for(inc == 1, [&]{
        float l = 0;
        for (int i=0; i<size*inc; i += inc) {
            l += v[i]*v[i];
        }
        return sqrt(l);
    });
}
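
If relying on constant propagation feels too implicit, the same branch can select an explicit template instantiation instead. The following is only a minimal sketch and is not part of the original answer; calcLengthImpl and its UnitStride parameter are hypothetical names introduced here for illustration.

```cpp
#include <cmath>

// Hypothetical helper: the loop is written once, and UnitStride selects
// whether the step is the literal 1 or the runtime value of inc.
template <bool UnitStride>
float calcLengthImpl(const float *v, int size, int inc) {
    const int step = UnitStride ? 1 : inc;  // folds to 1 when UnitStride is true
    float l = 0;
    for (int i = 0; i < size * step; i += step) {
        l += v[i] * v[i];
    }
    return std::sqrt(l);
}

// Original interface kept: one runtime check picks the instantiation.
float calcLength(const float *v, int size, int inc) {
    return inc == 1 ? calcLengthImpl<true>(v, size, inc)
                    : calcLengthImpl<false>(v, size, inc);
}
```

This guarantees two separate instantiations, but whether the unit-stride one actually gets vectorized still depends on the compiler and flags (a float sum generally has to be allowed to reassociate), so it is worth checking the generated assembly.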

C++ does not provide a way to detect whether a supplied function parameter is a constant expression or not, so you cannot automatically differentiate between supplied literals and runtime values.

If the parameter must be a function parameter, and you're not willing to change the way it is called in the two cases, then the only lever you have here is the type of the parameter: your suggestions for Constant<1>() vs Var(inc) are pretty good in that regard.
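
As a sketch of that type-based approach using only the standard library: std::integral_constant converts implicitly to its value through a constexpr conversion operator, so a single template can accept either a plain int or a constant wrapper. The alias Step is a hypothetical name introduced here, and the call site still has to spell a constant differently from a runtime value.

```cpp
#include <cmath>
#include <type_traits>

// Hypothetical shorthand for a compile-time step, analogous to Constant<C>.
template <int N>
using Step = std::integral_constant<int, N>;

// Inc is either int (runtime step) or Step<N> (step encoded in the type).
template <typename Inc>
float calcLength(const float *v, int size, Inc inc) {
    const int step = inc;  // Step<N> converts via its constexpr operator int
    float l = 0;
    for (int i = 0; i < size * step; i += step) {
        l += v[i] * v[i];
    }
    return std::sqrt(l);
}
```

With this, calcLength(v, size, Step<1>{}) instantiates a version whose step is fixed by the type, while calcLength(v, size, inc) instantiates the general int version.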

Option 1: Trust your compiler (aka do nothing)

Compilers can do what you want without you lifting a finger (well, you need to enable optimized builds, but that goes without saying).

Compilers can create what are called "function clones", which do what you want. A clone is a copy of a function specialized by constant propagation, i.e. the code the function compiles to when it is called with particular constant arguments. I found little documentation about this feature, so it's up to you whether you want to rely on it.

The compiler can also inline the function altogether, potentially making this a non-problem. You can help it by defining the function inline in a header, by using LTO, and/or by using compiler-specific attributes like __attribute__((always_inline)).

I am not saying you should always leave this to the compiler. Compiler optimizations are very good these days, and the rule of thumb is to trust the optimizer, but there are situations where you need to intervene manually. Just be aware of what the compiler can do and take it into account. And, as always with performance: measure, measure, measure, rather than trusting a gut feeling that something needs optimizing.

Option 2: Two overloads

float calcLength(const float *v, int size, int inc) {
    float l = 0;

    for (int i=0; i<size*inc; i += inc) {
        l += v[i]*v[i];
    }
    return sqrt(l);
}

template <int Inc>
float calcLength(const float *v, int size) {
    float l = 0;

    for (int i=0; i<size*Inc; i += Inc) {
        l += v[i]*v[i];
    }
    return sqrt(l);
}

The disadvantage here is code duplication, of course. A little care also needs to be taken at the call site:

calcLength(v, size, inc); // ok
calcLength<1>(v, size);   // ok
calcLength(v, size, 1);   // compiles, but calls the runtime overload, not calcLength<1>
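
One way to avoid the duplication in Option 2 might be to have both overloads forward to a single shared template, reusing the Constant / Var carriers from the question. This is a sketch under that assumption; calcLengthImpl is a hypothetical name, and Var's constructor is marked explicit here as a small deviation from the question's version.

```cpp
#include <cmath>

// Step carriers, as in the question.
template <int C>
struct Constant {
    static constexpr int value() { return C; }
};

struct Var {
    int v;
    constexpr explicit Var(int p_v) : v(p_v) {}
    constexpr int value() const { return v; }
};

// Hypothetical shared implementation: the algorithm is written once.
template <typename INC>
float calcLengthImpl(const float *v, int size, INC inc) {
    float l = 0;
    for (int i = 0; i < size * inc.value(); i += inc.value()) {
        l += v[i] * v[i];
    }
    return std::sqrt(l);
}

// Runtime overload: the step is only known when the call executes.
float calcLength(const float *v, int size, int inc) {
    return calcLengthImpl(v, size, Var(inc));
}

// Compile-time overload: the step is part of the instantiation.
template <int Inc>
float calcLength(const float *v, int size) {
    return calcLengthImpl(v, size, Constant<Inc>());
}
```

The call sites stay the same as above: calcLength(v, size, inc) and calcLength<1>(v, size).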

Option 3: Your version

Your version is ok.
