Using exprtk in a multithreaded program

I need to write a program in which string expressions are evaluated quite frequently. An example of an expression is below:

"x0*a0*a0+x1*a1+x2*a2+x3*a3+x4*a4....."

The expressions can be long and a string can contain multiple such expressions.

I wrote some test code using the C++ library exprtk.

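#include <cstdlib>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <sys/time.h>
#include <boost/algorithm/string.hpp>
#include "exprtk.hpp"

using namespace std;

vector<std::string> observation_functions;
vector<std::string> string_indices;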


template<typename T>
float* get_observation(float* sing_j, float* zrlist, int num_functions,int num_variables)
{
    //omp_set_nested(1);

    float* results = (float*)malloc(sizeof(float)*num_functions);
    exprtk::symbol_table<float> symbol_table;

    exprtk::expression<T> expression;
    exprtk::parser<T> parser;
    int i;
    for( i = 0; i < num_variables; i++)
    {
            symbol_table.add_variable("x"+string_indices[i], sing_j[i]);
            symbol_table.add_variable("a"+string_indices[i], zrlist[i]);
    }

    expression.register_symbol_table(symbol_table);
    for(i = 0; i < num_functions; i++)
    {
            parser.compile(observation_functions[i],expression);
            results[i] = expression.value();
    }
    return results;
}



int main()
{

    for( int i = 0; i < 52; i++)
    {
        ostringstream s2;
        s2<<i;
        string_indices.push_back(s2.str());
    }



    string hfun ="x0*a0*a0+x1*a1+x2*a2+x3*a3+x4*a4+x5*a5+x6*a6+x7*a7+x8*a8+x9*a9+x10*a10+x11*a11+x12*a12+x13*a13+x14*a14+x15*a15+x16*a16+x17*a17+x18*a18+x19*a19+x20*a20+x21*a21+x22*a22+x23*a23+x24*a24+x25*a25+x26*a26+x27*a27+x28*a28+x29*a29+x30*a30+x31*a31+x32*a32+x33*a33+x34*a34+x35*a35+x36*a36+x37*a37+x38*a38+x39*a39+x40*a40+x41*a41+x42*a42+x43*a43+x44*a44+x45*a45+x46*a46+x47*a47+x48*a48+x49*a49+x50*a50+x51*a51 ";


    boost::split(observation_functions, hfun, boost::is_any_of(" "));
    // calloc zero-initialises the inputs so the test does not read indeterminate values
    float* a=(float*)calloc(52, sizeof(float));
    float* c=(float*)calloc(52, sizeof(float));

    struct timeval t0,t1;
    gettimeofday(&t0, 0);
    for(int j=0; j < 210; j++)
    {
        #pragma omp parallel for schedule(static,1) num_threads(8)
        for(int i=0;i<104;i++)
        {
            float* b =get_observation<float>(a,c,1,52);
            free(b); // the caller owns the malloc'ed result; free it to avoid a leak
        }
    }
    gettimeofday(&t1, 0);
    long elapsed = (t1.tv_sec-t0.tv_sec)*1000000 + t1.tv_usec-t0.tv_usec;
    cout<<"elapsed:"<<elapsed<<endl;

}   

Note that this is test code. In the actual program, each thread is going to evaluate the expression with a different set of values. This code works fine, but I need to make it go faster.

Based on some other experiments, I found that I cannot share a single symbol table with multiple threads to compute a single expression faster. Sharing a symbol table among multiple threads led to memory corruption errors.

Can someone please provide some suggestions on how I could improve the performance?

Assume you have N threads. Then create N sets of exprtk-related objects (symbol_table, expression, parser) in the main program, outside the function and outside the for loops.

You could use vector<> to store them: e.g., for expression objects it would be vector<exprtk::expression<float> > expressions;

Then pass references to those objects when calling your function:

for(int i=0;i<104;i++)
    get_observation<float>(expressions[i], more params here..)

Template function definition: template <typename T> T* get_observation(exprtk::expression<T>& exp, more params here..)

You could also create one set of objects and make the others by copying, as Aloalo suggested.
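The idea of one set of objects per thread can be sketched as follows. This is only a minimal illustration: EvalContext and evaluate_all are hypothetical names, and EvalContext is a plain stand-in for one thread's symbol_table, expression and parser so that the sketch compiles without exprtk. With exprtk, the constructor would be the place to register the variables and call parser.compile once, leaving only the variable updates and expression.value() in the hot loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one thread's exprtk objects. It owns the
// variables that the "compiled" expression reads.
struct EvalContext {
    std::vector<float> x;  // this thread's x0..xN
    std::vector<float> a;  // this thread's a0..aN

    explicit EvalContext(std::size_t n) : x(n, 0.0f), a(n, 0.0f) {}

    // Plays the role of expression.value() for x0*a0*a0 + x1*a1 + ...
    float value() const {
        assert(!x.empty() && x.size() == a.size());
        float sum = x[0] * a[0] * a[0];
        for (std::size_t i = 1; i < x.size(); ++i) sum += x[i] * a[i];
        return sum;
    }
};

// Evaluates one result per input row. Context t handles rows t, t+n, t+2n, ...
// so no context is ever touched by two threads at once.
std::vector<float> evaluate_all(const std::vector<std::vector<float>>& x_rows,
                                const std::vector<std::vector<float>>& a_rows,
                                int n_threads) {
    assert(x_rows.size() == a_rows.size() && n_threads > 0);
    const std::size_t num_vars = x_rows.empty() ? 0 : x_rows[0].size();

    // Built once, outside the hot loop: one independent context per thread.
    std::vector<EvalContext> contexts(n_threads, EvalContext(num_vars));
    std::vector<float> results(x_rows.size());

    // Without -fopenmp the pragma is ignored and the loop simply runs serially.
    #pragma omp parallel for schedule(static, 1) num_threads(n_threads)
    for (int t = 0; t < n_threads; ++t) {
        EvalContext& ctx = contexts[t];
        for (std::size_t i = static_cast<std::size_t>(t); i < x_rows.size();
             i += static_cast<std::size_t>(n_threads)) {
            ctx.x = x_rows[i];  // update this thread's variables only
            ctx.a = a_rows[i];
            results[i] = ctx.value();
        }
    }
    return results;
}
```

For example, evaluate_all({{1, 2}, {3, 4}}, {{2, 3}, {1, 1}}, 2) returns {10, 7}. One caveat on the copying approach: as far as I know, exprtk's symbol_table and expression are reference-counted, so copying them yields a shallow copy that still shares state. It is worth checking the exprtk readme before relying on copies as independent per-thread objects.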

PS: Prefer smart pointers (https://stackoverflow.com/a/19042634) so that you do not forget to delete memory that was allocated locally.
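As a rough sketch of that advice, the malloc'ed results array in get_observation could be replaced by an owning type. The function names below are hypothetical and only show the two usual options, assuming a C++14 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Option 1: a smart pointer that owns the array and releases it automatically.
// make_unique<float[]> value-initialises the elements, so they start at 0.0f.
std::unique_ptr<float[]> make_results(std::size_t num_functions) {
    assert(num_functions > 0);
    return std::make_unique<float[]>(num_functions);
}

// Option 2: a std::vector, which also remembers its own size.
std::vector<float> make_results_vector(std::size_t num_functions) {
    return std::vector<float>(num_functions, 0.0f);
}
```

Either way the caller no longer needs a matching free or delete[]: the buffer is released when the returned object goes out of scope.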

You can try building exprtk objects only once and make a copy of them for each thread. This should be faster if copying of exprtk objects is faster than constructing them.
