
Processing large sequential data with TBB

I'm working on a C++ app that uses TBB to process large amounts of quote data (e.g. MSFT, AMZN, etc.), and I was wondering how I should structure it. I've been looking at parallel_for, pipeline, and concurrent_queue.

The process basically parses the data, processes it, and writes the output to a file. Parsing and processing can be done in parallel, but the output must be in order for each symbol.

Example input:
    - Msg #1 - AMZN #1
    - Msg #2 - AMZN #2
    - Msg #3 - IBM #1
    - Msg #4 - AMZN #3
    - Msg #5 - CSCO #1
    - Msg #6 - IBM #2

I would like to use a lock-free solution, or at least minimal locking, but it seems I have to keep the messages in a concurrent_queue to preserve the order.

Any ideas would be helpful.

Thanks, David

If you use the pipeline pattern (the tbb::pipeline class or the tbb::parallel_pipeline() function), you can use ordered (serial in-order) filters to ensure the output appears in exactly the same order as the input was received. You will not need any locks in your own code to enforce the ordering.

Does your quote data have either a timestamp or a sequence number? If not, add a sequence number in the producer thread and sort the data by sequence number after parsing it. The re-sorting can be done either in a batch or just before the files are written.
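A minimal sketch of the batch variant, using only the standard library. The names TaggedQuote, Sequencer, and restore_order are hypothetical, and the Sequencer is assumed to be called from a single producer thread, so it needs no synchronization.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message type carrying the producer-assigned sequence number.
struct TaggedQuote {
    std::uint64_t seq;
    std::string symbol;
    std::string payload;
};

// Stamps messages with increasing sequence numbers, starting at 1.
// Assumption: used by exactly one producer thread.
class Sequencer {
public:
    TaggedQuote tag(std::string symbol, std::string payload) {
        return TaggedQuote{next_++, std::move(symbol), std::move(payload)};
    }

private:
    std::uint64_t next_ = 1;
};

// After the parallel parsing step, put a batch back into arrival order.
void restore_order(std::vector<TaggedQuote>& batch) {
    std::sort(batch.begin(), batch.end(),
              [](const TaggedQuote& a, const TaggedQuote& b) {
                  return a.seq < b.seq;
              });
    // Sequence numbers come from one Sequencer, so they must be unique.
    const bool unique =
        std::adjacent_find(batch.begin(), batch.end(),
                           [](const TaggedQuote& a, const TaggedQuote& b) {
                               return a.seq == b.seq;
                           }) == batch.end();
    assert(unique);
    (void)unique;  // silence unused-variable warnings when NDEBUG is set
}
```

Sorting by the global sequence number also restores the order within each symbol, since each symbol's messages are a subsequence of the arrival order.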

You can create an output structure (a hash or a list) in which the key is the position of the element in the output (1st, 2nd, ...) and the value is the data to be written. Then, when all the elements are ready, you can output the structure in the desired order.

This way you don't care which thread finishes first.
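As a sketch of this idea, the class below keeps results in a std::map keyed by output position. It is a slight variation on the description above: instead of waiting for every element, it releases each contiguous run as soon as it is available. The OrderedOutput name and its interface are hypothetical, and the class is not thread-safe, so it is assumed that one writer thread owns it (or that callers serialize access).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Buffers results by 1-based output position and releases them in order.
// Assumption: accessed by one thread at a time.
class OrderedOutput {
public:
    // Store the result for `position`; positions may arrive in any order.
    void put(std::size_t position, std::string text) {
        assert(position >= next_ && "position was already released");
        pending_.emplace(position, std::move(text));
    }

    // Remove and return every result that is next in line, stopping at
    // the first gap. Returns an empty vector if the next one is missing.
    std::vector<std::string> take_ready() {
        std::vector<std::string> ready;
        auto it = pending_.begin();
        while (it != pending_.end() && it->first == next_) {
            ready.push_back(std::move(it->second));
            it = pending_.erase(it);
            ++next_;
        }
        return ready;
    }

private:
    std::map<std::size_t, std::string> pending_;  // sorted by position
    std::size_t next_ = 1;                        // next position to release
};
```

For example, if results arrive for positions 3, 1, 2, then nothing is released after 3, position 1 is released on its own, and 2 and 3 are released together once 2 arrives.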
