
Which types of memory_order should be used for non-blocking behaviour with an atomic_flag?

Instead of having my threads wait, doing nothing, for other threads to finish using shared data, I'd like them to do something else in the meantime (like checking for input, or re-rendering the previous frame in the queue, and then coming back to check whether the other thread is done with its task).

I think this code that I've written does that, and it "seems" to work in the tests I've performed, but I don't really understand how std::memory_order_acquire and std::memory_order_release work exactly, so I'd like some expert advice on whether I'm using them correctly to achieve the behaviour I want.

Also, I've never seen multithreading done this way before, which makes me a bit worried. Are there good reasons not to have a thread do other tasks instead of waiting?

```cpp
/*test program
intended to test if atomic flags can be used to perform other tasks while shared
data is in use, instead of blocking

each thread enters the flag protected part of the loop 20 times before quitting
if the flag indicates that the if block is already in use, the thread is intended to
execute the code in the else block (only up to 5 times to avoid cluttering the output)

debug note: this doesn't work with std::cout because all the threads are using it at once
and it's not thread safe so it all gets garbled.  at least it didn't crash

real world usage
one thread renders and draws to the screen, while the other checks for input and
provides frameData for the renderer to use.  neither thread should ever block*/

#include <fstream>
#include <atomic>
#include <thread>
#include <string>

struct ThreadData {
    int numTimesToWriteToDebugIfBlockFile;
    int numTimesToWriteToDebugElseBlockFile;
};

class SharedData {
public:
    SharedData() {
        threadData = new ThreadData[10];
        for (int a = 0; a < 10; ++a) {
            threadData[a] = { 20, 5 };
        }
        flag.clear();
    }

    ~SharedData() {
        delete[] threadData;
    }

    void runThread(int threadID) {
        while (this->threadData[threadID].numTimesToWriteToDebugIfBlockFile > 0) {
            // test_and_set returns the flag's *previous* value, so false means
            // the flag was clear and this thread has just acquired the lock
            if (!this->flag.test_and_set(std::memory_order_acquire)) {
                std::string fileName = "debugIfBlockOutputThread#";
                fileName += std::to_string(threadID);
                fileName += ".txt";
                std::ofstream writeFile(fileName.c_str(), std::ios::app);
                writeFile << threadID << ", running, output #" << this->threadData[threadID].numTimesToWriteToDebugIfBlockFile << std::endl;
                writeFile.close();
                writeFile.clear();
                this->threadData[threadID].numTimesToWriteToDebugIfBlockFile -= 1;
                this->flag.clear(std::memory_order_release);
            }
            else {
                if (this->threadData[threadID].numTimesToWriteToDebugElseBlockFile > 0) {
                    std::string fileName = "debugElseBlockOutputThread#";
                    fileName += std::to_string(threadID);
                    fileName += ".txt";
                    std::ofstream writeFile(fileName.c_str(), std::ios::app);
                    writeFile << threadID << ", standing by, output #" << this->threadData[threadID].numTimesToWriteToDebugElseBlockFile << std::endl;
                    writeFile.close();
                    writeFile.clear();
                    this->threadData[threadID].numTimesToWriteToDebugElseBlockFile -= 1;
                }
            }
        }
    }
private:
    ThreadData* threadData;
    std::atomic_flag flag;
};

void runThread(int threadID, SharedData* sharedData) {
    sharedData->runThread(threadID);
}

int main() {
    SharedData sharedData;
    std::thread thread[10];
    for (int a = 0; a < 10; ++a) {
        thread[a] = std::thread(runThread, a, &sharedData);
    }
    for (int a = 0; a < 10; ++a) {
        thread[a].join();
    }
    return 0;
}
```

The memory ordering you're using here is correct.

The acquire memory order when you test and set your flag (to take your hand-written lock) has the effect, informally speaking, of preventing any memory accesses of the following code from becoming visible before the flag is tested. That's what you want, because you want to ensure that those accesses are effectively not done if the flag was already set. Likewise, the release order on the clear at the end prevents any of the preceding accesses from becoming visible after the clear, which is also what you need so that they only happen while the lock is held.

However, it's probably simpler to just use a std::mutex. If you don't want to wait to take the lock, but instead do something else if you can't, that's what try_lock is for.

```cpp
#include <mutex>

class SharedData {
    // ...
private:
    std::mutex my_lock;
};
// ...
if (my_lock.try_lock()) {
    // lock was taken, proceed with critical section
    my_lock.unlock();
} else {
    // lock not taken, do non-critical work
}
```

This may have a bit more overhead, but avoids the need to think about atomicity and memory ordering. It also gives you the option to easily do a blocking wait if that later becomes useful. If you've designed your program around an atomic_flag and later find a situation where you must wait to take the lock, you may find yourself stuck with either spinning while continually retrying the lock (which is wasteful of CPU cycles), or something like std::this_thread::yield(), which may wait for longer than necessary after the lock is available.

It's true this pattern is somewhat unusual. If there is always non-critical work to be done that doesn't need the lock, commonly you'd design your program to have a separate thread that just does the non-critical work continuously, and then the "critical" thread can just block as it waits for the lock.
