How to multithread reading, processing and writing data in Qt C++?

I am relatively new to Qt and C++ and completely self-taught. I'm trying to multi-thread a task which I currently have working on a single thread.

My task is as follows:

  1. Read in multiple csv files line by line and store each column of data from each file in a separate vector.
  2. Process each index of each data vector through various mathematical equations.
  3. Once each index of data is processed, write the results of the equations to an output file.

Example file being read in:

Col1,   Col2,   Col3,   Col4,  . . .  ColN
1,      A,      B,      C,     . . .  X
2,      D,      E,      F,     . . .  Y
3,      G,      H,      J,     . . .  Z
.,      .,      .,      .,     . . .  .
.,      .,      .,      .,     . . .  .
N,      .,      .,      .,     . . .  .
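
For illustration, step 1 could be sketched roughly as below. This is a minimal sketch under assumptions, not the code actually in use: `readColumns` is a hypothetical helper name, the first line is treated as a header, and every data cell is assumed to be numeric (the letters in the example above stand in for values).

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads comma-separated text with one header row and
// returns one vector per column, so columns[c][r] is row r of column c.
// Assumes every data cell is numeric; std::stod throws on anything else.
std::vector<std::vector<double>> readColumns(std::istream& in)
{
    std::vector<std::vector<double>> columns;
    std::string line;
    std::string cell;

    // The header row only tells us how many columns there are.
    if (!std::getline(in, line)) {
        return columns; // empty input
    }
    std::stringstream header(line);
    while (std::getline(header, cell, ',')) {
        columns.emplace_back();
    }

    // Every following row contributes one value to each column.
    while (std::getline(in, line)) {
        if (line.empty()) {
            continue; // skip blank lines
        }
        std::stringstream row(line);
        for (auto& column : columns) {
            if (!std::getline(row, cell, ',')) {
                break; // short row: the remaining columns get no value
            }
            column.push_back(std::stod(cell)); // stod skips leading spaces
        }
    }
    return columns;
}
```

Passing a `std::ifstream` opened on one of the csv files would give `Col2` as `columns[1]`, `Col3` as `columns[2]`, and so on.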

And here is some pseudocode showing the principle:

for (int i = 0; i < N; ++i)
{
    // there are multiple nested for loops, but only one shown here

    // calculate multiple variables. Here are two examples:
    calculatedVariable = Col2[i] + Col3[i] / Col4[i];
    calculatedVariable2 = (Col2[i] * 0.98) / (Col2[i] + Col3[i] + Col4[i]) + (Col2[i] + Col3[i]);

    // then write the calculated variables to an output text file
    output << calculatedVariable << "," << calculatedVariable2 << std::endl;
}

This works well because the code writes to the output text file at the end of each loop iteration, so it doesn't clog up RAM (i.e. it avoids doing all the computations, storing the results in vectors and then writing everything out in one go).

My problem is that these files can have hundreds of thousands of lines and processing can take a couple of hours. If I could multi-thread the work, so that several indices of data are processed simultaneously while the order of data in the output file is maintained, it would drastically reduce computation time. I don't need to multi-thread the reading of data at this stage.

I am currently struggling with the conceptual side of tackling this and can't find any similar examples online. I've looked at QtConcurrent as an option but am not quite sure how to apply it.

If anyone can point me in the right direction, that would be appreciated. Thank you.

EDIT 1: Thanks for the responses. The bottleneck is the actual processing of the data through some long iterative calculations, not the IO operations. Let's say I read 2 files, each with 1000 lines. If I want to run some calculations for each line in file 1 against each line in file 2, that's 1,000,000 cases. If there were some way to split those calculations across, let's say, 10 threads, that would cut processing time massively.

Basically, you want this. Feel free to replace the std:: mechanisms below with their Qt equivalents (QString vs std::string, etc.).

#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

struct Job
{
   std::string inputFileName;
   std::string outputFileName;
};

std::queue<Job> jobs;

// not shown - populate jobs with the input/output names of files you want to manage

std::mutex m;

unsigned int nthreads = std::thread::hardware_concurrency();
if (nthreads == 0) {
    nthreads = 1;  // hardware_concurrency() may return 0 when the count is unknown
}
std::vector<std::thread> threads;
for (unsigned int i = 0; i < nthreads; i++) {
    // [&] lets the lambda use m and jobs by reference
    std::thread t([&] {

        while (true) {
            Job job;
            {
                std::lock_guard<std::mutex> lck(m); // acquire the lock that protects jobs
                if (jobs.empty()) {
                    return;  // the queue is empty, this thread can exit
                }
                job = jobs.front();
                jobs.pop();
            }   // the lock is released here, before the slow work starts

            // YOUR CODE GOES HERE
            // Open the file job.inputFileName and read in the contents
            // Open the output file job.outputFileName
            // then do your processing and write to the output file handle
            // close your files

            // all done for this file - loop back to the top of this lambda function and get the next file
        }
    });

    threads.push_back(std::move(t));
}

// wait for all threads to finish
for (auto& t : threads) {
    t.join();
}
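
The job queue above hands out whole files. For the case described in the question's edit, where the cost is in the per-row calculations and the output order matters, one possible variation is to split the row range into contiguous chunks, let each thread format its chunk into a private buffer, and write the buffers out in chunk order once every thread has finished. The sketch below is only an illustration under assumptions: `processRows` is a hypothetical name, the three columns are assumed to be numeric vectors of equal length, and only the first formula from the question is evaluated.

```cpp
#include <algorithm>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: evaluates col2[i] + col3[i] / col4[i] for every row
// on nthreads threads and writes one result per line to out, in row order.
// Assumes col2, col3 and col4 all have the same length.
void processRows(const std::vector<double>& col2,
                 const std::vector<double>& col3,
                 const std::vector<double>& col4,
                 std::ostream& out,
                 unsigned int nthreads)
{
    const std::size_t n = col2.size();
    if (nthreads == 0) {
        nthreads = 1; // hardware_concurrency() may return 0
    }
    // Rows per thread, rounded up so that every row is covered.
    const std::size_t chunk = (n + nthreads - 1) / nthreads;

    std::vector<std::string> parts(nthreads); // one private buffer per thread
    std::vector<std::thread> threads;
    for (unsigned int t = 0; t < nthreads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        threads.emplace_back([&, t, begin, end] {
            std::ostringstream buf;
            for (std::size_t i = begin; i < end; ++i) {
                buf << col2[i] + col3[i] / col4[i] << '\n';
            }
            parts[t] = buf.str(); // each thread writes only its own slot, so no lock is needed
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    // Chunks were assigned in row order, so writing them in index order
    // reproduces the single-threaded output.
    for (const auto& part : parts) {
        out << part;
    }
}
```

This version holds the formatted output of all rows in memory until the threads have joined. If that is too much for very large inputs, the same function could be called on one batch of rows at a time so that only a batch's worth of output is buffered.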
