Can an MFC multithreaded program run in parallel?

I want to use multithreading in MFC. I'm running a small experiment to see whether the program actually runs in parallel. I wrote two thread functions like this:

UINT CMFCApplication2Dlg::thread01(LPVOID pParam)
{
    clock_t t1, t2;
    t1 = clock();
    for (int i = 0; i < 300000; i++)
        cout << "thread01111111111" << endl;

    t2 = clock();
    cout << "clock is " << t2 - t1 << endl;
    return 0;
}

UINT CMFCApplication2Dlg::thread02(LPVOID pParam)
{
    clock_t t1, t2;
    t1 = clock();

    for (int i = 0; i < 300000; i++)
        cout << "thread02222222222" << endl;


    t2 = clock();
    cout << "clock is " << t2 - t1 << endl;
    return 0;
}

I then start both threads and send their output to a console window:

AllocConsole();
freopen("CONOUT$", "w+t", stdout);
freopen("CONIN$", "r+t", stdin);
printf("Hello World!\n");

CWinThread *pThread01;
CWinThread *pThread02;
pThread01 = AfxBeginThread(thread01, this, 0, 0, 0, NULL);
pThread02 = AfxBeginThread(thread02, this, 0, 0, 0, NULL);

When I run the two threads together, the reported clock count is 118020. When I run a single thread, it is 60315. When I put the two loops in the same thread, one after the other, I get 102795.

I used to think the compiler would automatically arrange for multiple threads to execute in parallel, but this looks like multithreaded concurrency on a single core: it doesn't reduce the runtime. The CPU I'm using has 4 cores. What should I do to run the threads in parallel on different cores to achieve high performance?

Both threads are trying to use a shared resource (std::cout) at the same time. The system has to serialize the output at some point, so most of the time one thread is waiting for the other to finish writing. This is called synchronization. When you use threads to improve performance, you want to minimize the time spent on synchronization, because during that time the threads can't do useful work.
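One way to reduce the time spent contending for the stream is to let each thread format its output privately and touch the shared stream only once. The following is a minimal sketch, not the asker's MFC code: it assumes standard std::thread instead of AfxBeginThread so that it is self-contained, and the names write_batched and run_two_writers are made up for illustration.

```cpp
#include <iostream>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper: formats `count` lines in a thread-local buffer, then
// writes them to the shared stream in a single locked operation.
void write_batched(std::ostream& out, std::mutex& out_mutex,
                   const std::string& line, int count)
{
    std::ostringstream local;           // private to this thread, no contention
    for (int i = 0; i < count; ++i)
        local << line << '\n';          // '\n' avoids the flush that std::endl forces

    std::lock_guard<std::mutex> lock(out_mutex);
    out << local.str();                 // one synchronized write per thread
}

// Hypothetical driver: runs two writers against the same stream and waits for both.
void run_two_writers(std::ostream& out, int count)
{
    std::mutex out_mutex;
    std::thread a([&] { write_batched(out, out_mutex, "thread01", count); });
    std::thread b([&] { write_batched(out, out_mutex, "thread02", count); });
    a.join();
    b.join();
}
```

Calling run_two_writers(std::cout, 300000) would mirror the experiment in the question. The formatting work can then overlap, although the final write to the console is still serialized.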

Try replacing cout in the inner loop with a lengthy calculation, and use cout only at the end to print the final result, so that the compiler cannot optimize the calculation away (without the cout it could, because the calculation would have no observable effect).
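As a rough illustration of that advice, the sketch below gives each thread an independent calculation and its own result slot, so the threads share nothing while they run. It assumes standard std::thread rather than AfxBeginThread, and busy_work and run_in_parallel are hypothetical names; the arithmetic is only a stand-in for real work.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Stand-in for a lengthy calculation: iterates a linear congruential
// recurrence `steps` times. Unsigned arithmetic wraps, so overflow is well defined.
std::uint64_t busy_work(std::uint64_t seed, std::uint64_t steps)
{
    std::uint64_t x = seed;
    for (std::uint64_t i = 0; i < steps; ++i)
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;
    return x;
}

// Runs one busy_work call per seed, each on its own thread. Every thread
// writes only to its own element of `results`, so no locking is needed.
std::vector<std::uint64_t> run_in_parallel(const std::vector<std::uint64_t>& seeds,
                                           std::uint64_t steps)
{
    std::vector<std::uint64_t> results(seeds.size());
    std::vector<std::thread> workers;
    workers.reserve(seeds.size());
    for (std::size_t i = 0; i < seeds.size(); ++i)
        workers.emplace_back([&results, &seeds, i, steps] {
            results[i] = busy_work(seeds[i], steps);
        });
    for (std::thread& t : workers)
        t.join();
    return results;   // print these once, after all threads have finished
}
```

Printing the returned values after the join keeps the results observable, which is what stops the optimizer from discarding the loops.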

Also, std::clock lacks the precision needed for profiling. I recommend using std::chrono::high_resolution_clock instead, which is usually implemented with QueryPerformanceCounter() on Windows. This is the best you can get on Windows.
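A small timing wrapper keeps the measurement code out of the thread functions. This sketch swaps in std::chrono::steady_clock, which is guaranteed to be monotonic; in the MSVC standard library high_resolution_clock is an alias for it, but on other toolchains the two may differ. The name time_seconds is hypothetical.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
template <typename Work>
double time_seconds(Work&& work)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    work();
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count();
}
```

For example, `double s = time_seconds([&] { result = compute(); });` would time a single call, where compute stands for whatever function you are profiling.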

Try this: 尝试这个:

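// Requires <chrono>, <cstdint>, <iomanip> and <iostream>.
UINT CMFCApplication2Dlg::thread01(LPVOID pParam)
{
    using myclock = std::chrono::high_resolution_clock;
    auto t1 = myclock::now();

    // Unsigned on purpose: the values exceed 64 bits after about 90 iterations,
    // and only unsigned overflow is well defined (it wraps around).
    std::uint64_t first = 0, second = 1, result = 0;
    for( std::uint64_t i = 0; i < 10000000; ++i )
    {
         result = first + second;
         first = second;
         second = result;
    }

    auto t2 = myclock::now();
    std::chrono::duration<double> td = t2 - t1;  // duration in seconds

    // Print the result so the loop has an observable effect.
    std::cout << "result is " << result << '\n'
              << "clock is " << std::fixed << std::setprecision( 6 ) << td.count() << " s" << std::endl;

    return 0;
}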

Make sure the calculation is not too simple, because the optimizer is pretty clever and may, for instance, turn your O(n) algorithm into O(1). It may even do the entire calculation at compile time and only assign a constant at runtime. To avoid that, you could read the number of loop iterations from cin instead, though this wasn't necessary when I tested the above code on MSVC 2017, even with full optimization.

Read about the Concurrency Runtime. It can help you without the headache: https://msdn.microsoft.com/en-us/library/dd504870.aspx
