
Adjusting granularity in tbb parallel_pipeline

The task for the pipeline is the following:

  1. sequentially read a huge number (10-15k) of ~100-200 MB compressed files
  2. decompress each file in parallel
  3. deserialize each decompressed file in parallel
  4. process the resulting deserialized objects and compute values based on all objects (mean, median, groupings, etc.)

When I get a decompressed file's memory buffer, the serialized blocks go one after another, so I'd like to pass them to the next filter in the same manner or, at least, adjust this process by packing the serialized blocks into groups of some size and then passing those. However (as I understand it), tbb_pipeline makes me pass a pointer to the buffer with ALL serialized blocks, because each filter has to take a pointer and return a pointer.

Using a concurrent queue to accumulate packs of serialized objects defeats the purpose of using tbb_pipeline, as I understand it. Moreover, the constness of operator() in filters doesn't let me keep my own intermediate "task pool" (nevertheless, if each thread had its own local copy of storage for "tasks" and just cut the right pieces from it, that would be great).

Primary question: Is there some way to "adjust" granularity in this situation? (i.e., some filter gets a pointer to all serialized objects and passes small packs of objects to the next filter)

Reformatting (splitting, etc.) the input files is almost impossible.

Secondary question: When I accumulate processing results, I don't really care about any kind of order; I need only aggregate statistics. Can I use a parallel filter instead of serial_out_of_order, accumulate the processing results for each thread somewhere, and then just merge them?

However (as I understand it) tbb_pipeline makes me pass a pointer to the buffer with ALL serialized blocks because each filter has to take a pointer and return a pointer.

First, I think it's better to use the more modern, type-safe form of the pipeline: parallel_pipeline. It does not require you to pass any specific pointer to any specific data. You just specify which data of which type the next stage needs in order to process it. So it's rather a matter of how your first filter partitions the data to be processed by the following filters.

Primary question: Is there some way to "adjust" granularity in this situation? (i.e., some filter gets a pointer to all serialized objects and passes small packs of objects to the next filter)

You can safely embed one parallel algorithm into another in order to change the granularity for some stages, e.g., at the top level, the 1st pipeline goes through the file list; at the nested level, the 2nd pipeline reads big blocks of the file; and finally, the innermost pipeline breaks the big blocks down into smaller ones for some of the 2nd-level stages. See a general example of nesting below.

Secondary question: Can I use a parallel filter instead of serial_out_of_order, accumulate the processing results for each thread somewhere, and then just merge them?

Yes, you can always use a parallel filter if it does not modify shared data. For example, you can use tbb::combinable to collect thread-specific partial sums and then combine them.

but nevertheless, if each thread had its own local copy of storage for "tasks" and just cut the right pieces from it, that would be great

Yes, they do. Each thread has its own local pool of tasks.


General example of nested parallel_pipelines

// Classic TBB spelling, as in the rest of this answer. With oneTBB, include
// <tbb/parallel_pipeline.h> and use tbb::filter_mode::serial_in_order /
// tbb::filter_mode::parallel instead of filter::serial / filter::parallel.
#include "tbb/pipeline.h"
#include <cstdio>
#include <queue>
#include <string>
using namespace tbb;

std::queue<std::string> files;   // populated with file names elsewhere

void run_pipelines() {
    parallel_pipeline( 2/*only two files at once*/,
        make_filter<void,std::string>(
            filter::serial,
            [&](flow_control& fc)-> std::string {
                if( !files.empty() ) {
                    std::string filename = files.front();
                    files.pop();
                    return filename;
                } else {
                    fc.stop();
                    return "stop";
                }
            }
        ) &
        make_filter<std::string,void>(
            filter::parallel,
            [](std::string s) {

                // a nested pipeline
                parallel_pipeline( 1024/*up to 1024 characters in flight*/,
                    make_filter<void,char>(
                        filter::serial,
                        [&s](flow_control& fc)-> char {
                            if( !s.empty() ) {
                                char c = s.back();
                                s.pop_back();
                                return c;
                            } else {
                                fc.stop();
                                return 0;
                            }
                        }
                    ) &
                    make_filter<char,void>(
                        filter::parallel,
                        [](char c) {
                            putc(c, stdout);
                        }
                    )
                );
            }
        )
    );
}
