
How can I lock a mutex for an element in an array, not for the complete array?

Short version of the question: I have 2 functions that share the same array; when one is editing it, the other is reading it. However, the vector is long (5000 samples) and concurrent access rarely happens, yet the contention on MUTEX1 is slowing the program down.

How can I lock certain locations in memory instead of the complete block in order to reduce contention?

EDIT: Note: I have to use updated G values whenever possible.

EDIT2: For example, I have an array G of length 5000. foo1 locks mutex1 to edit index 124. Although foo2 wants to edit index 2349, it cannot do so until foo1 releases mutex1.

Is there a way I can move the contention of locking a mutex down to the element level? Meaning: I want foo2 and foo1 to contend for the same mutex only when they want to edit the same index, e.g. when foo1 wants to edit index 3156 and foo2 also wants to edit index 3156.

Long version with code explanation: I am writing code for a complex mathematical function, and I am using pthreads to parallelize it and improve performance. The code itself is too complex to post in full, but I can post a model of it.

Basically I have 2 arrays that I want to edit using 2 threads that run in parallel. One thread runs foo1 and the other runs foo2. However, they should run in a particular sequence, and I use mutexes (_B, _A1, and _A2) to guarantee that sequence. It goes as follows:

foo1 (first half)
foo2 (first half) and foo1 (second half) (in parallel)
foo1 (first half) and foo2 (second half) (in parallel)
...
foo2(second half)

Then I would retrieve my results. In the first half of foo1 I use results from G1 that might be edited at the same time by foo2, so I use mutex1 to protect it. The same happens in foo2 for G. However, locking the complete vector for one value is very inefficient; the two threads almost never edit the same memory location at the same time (when I compare the results, they are almost always the same). I would like a way to lock one element at a time, so that the threads only contend for the same element.

I will describe the code for anyone interested in how it works:

#include <pthread.h>
#include <iostream>

using namespace std;

#define numThreads 2
#define Length 10000

pthread_t threads[numThreads];

pthread_mutex_t mutex1   = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t Mutex_B  = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t Mutex_A1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t Mutex_A2 = PTHREAD_MUTEX_INITIALIZER;

struct data_pointers
{
    double  *A;
    double  *B;
    double  *G;
    double  *L;
    int idxThread;
};

void foo1   (data_pointers &data);
void foo2   (data_pointers &data);

void *thread_func(void *arg){
    data_pointers data = *((data_pointers *) arg);
    if (data.idxThread == 0)
        foo1(data);
    else
        foo2(data);
    return nullptr;
}

Up to here are the definitions and the thread entry function; bear in mind that I define Length as 10000 and numThreads as 2.

void foo1 ( data_pointers &data)
{
    double *A           = data.A;
    double *L           = data.L; 
    double *G           = data.G; 
    double U;

    for (int ijk =0;ijk<5;ijk++){
        /* here goes some definitions*/

        pthread_mutex_lock(&Mutex_A1);

        for (int k =0;k<Length;k++){
            pthread_mutex_lock(&mutex1); 
            U = G[k];
            pthread_mutex_unlock(&mutex1);
            /*U undergoes a lot of mathematical operations here


            */
        }

        pthread_mutex_lock(&Mutex_B);
        pthread_mutex_unlock(&Mutex_A2);
        for (int k =0;k<Length;k++){
            /*U another mathematical operations here


            */
            pthread_mutex_lock(&mutex1);
            L[k] = U;
            pthread_mutex_unlock(&mutex1);
            pthread_mutex_unlock(&Mutex_B);
        }
    }
}

In foo1 I lock Mutex_A1 and complete my work, then I lock Mutex_B and unlock Mutex_A2 so foo2 can start working. Note that main starts by locking Mutex_A2. This way I guarantee that foo1 starts the second half with Mutex_B locked, so foo2 cannot enter the second half of its function until foo1 unlocks Mutex_B.

void foo2 (data_pointers &data)
{
    double *A           = data.A;
    double *L           = data.L; 
    double *G           = data.G; 
    double U;

    for (int ijk =0;ijk<5;ijk++){
        /* here goes some definitions*/

        pthread_mutex_lock(&Mutex_A1);

        for (int k =0;k<Length;k++){
            pthread_mutex_lock(&mutex1); 
            U = G[k];
            pthread_mutex_unlock(&mutex1);
            /*U undergoes a lot of mathematical operations here


            */
        }

        pthread_mutex_lock(&Mutex_B);
        pthread_mutex_unlock(&Mutex_A2);
        for (int k =0;k<Length;k++){        
            /*U another mathematical operations here


            */
            pthread_mutex_lock(&mutex1);
            L[k] = U;
            pthread_mutex_unlock(&mutex1);
            pthread_mutex_unlock(&Mutex_B);

        }
    }
}

Now, when foo1 unlocks Mutex_B it will have to wait for foo2 to unlock Mutex_A1 before it can continue; foo2 only unlocks Mutex_A2 once it has already unlocked Mutex_B.

This goes on and on for 5 iterations.

int main(){
    double G1[Length];
    double G2[Length];
    double B1[Length];
    double B2[Length];
    double A2[Length];
    double A1[Length];
    data_pointers data[numThreads];

    data[0].L           = G2;
    data[0].G           = G1;   
    data[0].A           = A1;
    data[0].B           = B1;
    data[0].idxThread   = 0;

    data[1].L           = G1;
    data[1].G           = G2;   
    data[1].A           = A2;
    data[1].B           = B2;
    data[1].idxThread   = 1;

    pthread_mutex_lock(&Mutex_A2);

    pthread_create(&(threads[0]), NULL, thread_func, (void *) &(data[0]));
    pthread_create(&(threads[1]), NULL, thread_func, (void *) &(data[1]));
    pthread_join(threads[1], NULL);
    pthread_join(threads[0], NULL);

    pthread_mutex_unlock(&Mutex_A1);
    pthread_mutex_unlock(&Mutex_A2);

    return 0;
}

Note that this is only example code. It compiles and works as intended, but produces no output.

LAST EDIT: Thank you all for the great ideas; I learned a lot and had fun following these suggestions. I will upvote all the answers, as they were all useful, and pick the one closest to the original question (atomicity).

If you do not resize your arrays, you do not need any mutexes, either on individual elements or on the whole array.

Read your values atomically, write your values atomically, and stay calm.
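
A minimal sketch of that approach, assuming the shared array is stored as std::atomic<double> elements and is never resized (the names shared_g, writer, and reader are illustrative, not taken from the question):

#include <atomic>
#include <functional>
#include <thread>
#include <vector>

// Each element is an atomic double, so single elements can be read and
// written concurrently without any mutex, as long as the vector is
// never resized.
std::vector<std::atomic<double>> shared_g(5000);

void writer()
{
    for (std::size_t k = 0; k < shared_g.size(); ++k)
        shared_g[k].store(static_cast<double>(k) * 0.5);   // atomic write
}

void reader(double &sum)
{
    sum = 0.0;
    for (std::size_t k = 0; k < shared_g.size(); ++k)
        sum += shared_g[k].load();                         // atomic read
}

int main()
{
    double sum = 0.0;
    std::thread t_w(writer);
    std::thread t_r(reader, std::ref(sum));
    t_w.join();
    t_r.join();
}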

If you want high-performance multi-threaded access to an array-like data structure without using a mutex, you could investigate compare-and-swap; maybe you can design a lock-free data structure that works for your specific problem: https://en.wikipedia.org/wiki/Compare-and-swap
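
For the compare-and-swap route, here is a minimal sketch of a lock-free read-modify-write on a single element; the helper add_to_element and its retry loop are illustrative assumptions, not something from the question:

#include <atomic>

// Atomically add 'delta' to one element using compare-and-swap.
// The loop retries whenever another thread modified the element
// between our load and our attempted store.
void add_to_element(std::atomic<double> &elem, double delta)
{
    double expected = elem.load();
    // compare_exchange_weak may also fail spuriously, hence the loop;
    // on failure it reloads the current value into 'expected'.
    while (!elem.compare_exchange_weak(expected, expected + delta))
    {
        // retry with the refreshed 'expected'
    }
}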

Regarding the posted code, it seems you are complicating matters a bit too much. If you want to achieve:

foo1 (first half)
foo2 (first half) and foo1 (second half) (in parallel)
foo1 (first half) and foo2 (second half) (in parallel)
...
foo2(second half)

two mutexes should do.

Maybe something like this could do. Some pseudo-code below:

// These global variables control which thread is allowed to
// execute the first and the second half.
// 1 --> Foo1 may run
// 2 --> Foo2 may run
int accessFirstHalf = 1;
int accessSecondHalf = 1;

void foo1 ( data_pointers &data)
{
    while(YOU_LIKE_TO_GO_ON)
    {
        while (true)
        {
            TAKE_MUTEX_FIRST_HALF;
            if (accessFirstHalf == 1)
            {
                RELEASE_MUTEX_FIRST_HALF;
                break;
            }
            RELEASE_MUTEX_FIRST_HALF;
            pthread_yield();
        }

        // Do the first half

        TAKE_MUTEX_FIRST_HALF;
        // Allow Foo2 to do first half
        accessFirstHalf = 2;
        RELEASE_MUTEX_FIRST_HALF;

        while (true)
        {
            TAKE_MUTEX_SECOND_HALF;
            if (accessSecondHalf == 1)
            {
                RELEASE_MUTEX_SECOND_HALF;
                break;
            }
            RELEASE_MUTEX_SECOND_HALF;
            pthread_yield();
        }

        // Do the second half

        TAKE_MUTEX_SECOND_HALF;
        // Allow Foo2 to do second half
        accessSecondHalf = 2;
        RELEASE_MUTEX_SECOND_HALF;
    }
}


void foo2 ( data_pointers &data)
{
    while(YOU_LIKE_TO_GO_ON)
    {
        while (true)
        {
            TAKE_MUTEX_FIRST_HALF;
            if (accessFirstHalf == 2)
            {
                RELEASE_MUTEX_FIRST_HALF;
                break;
            }
            RELEASE_MUTEX_FIRST_HALF;
            pthread_yield();
        }

        // Do the first half

        TAKE_MUTEX_FIRST_HALF;
        // Allow Foo1 to do first half
        accessFirstHalf = 1;
        RELEASE_MUTEX_FIRST_HALF;

        while (true)
        {
            TAKE_MUTEX_SECOND_HALF;
            if (accessSecondHalf == 2)
            {
                RELEASE_MUTEX_SECOND_HALF;
                break;
            }
            RELEASE_MUTEX_SECOND_HALF;
            pthread_yield();
        }

        // Do the second half

        TAKE_MUTEX_SECOND_HALF;
        // Allow Foo1 to do second half
        accessSecondHalf = 1;
        RELEASE_MUTEX_SECOND_HALF;
    }
}


int main()
{
    // start the threads with foo1 and foo2
}
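
As a rough sketch only, here is one way the TAKE/RELEASE placeholders above could map to pthreads; the helper names are assumptions, and sched_yield() is used because pthread_yield() is non-standard:

#include <pthread.h>
#include <sched.h>

// One pthread mutex per flag stands in for the TAKE_MUTEX_*/RELEASE_MUTEX_*
// placeholders of the pseudo-code above.
pthread_mutex_t firstHalfMutex  = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t secondHalfMutex = PTHREAD_MUTEX_INITIALIZER;

int accessFirstHalf  = 1;   // 1 --> foo1 may run, 2 --> foo2 may run
int accessSecondHalf = 1;

// Spin until the flag guarded by 'm' equals 'who', yielding the CPU in between.
void wait_for_turn(pthread_mutex_t &m, int &flag, int who)
{
    for (;;)
    {
        pthread_mutex_lock(&m);       // TAKE_MUTEX_...
        bool myTurn = (flag == who);
        pthread_mutex_unlock(&m);     // RELEASE_MUTEX_...
        if (myTurn)
            return;
        sched_yield();
    }
}

// Hand the half over to the other thread.
void pass_turn(pthread_mutex_t &m, int &flag, int next)
{
    pthread_mutex_lock(&m);
    flag = next;
    pthread_mutex_unlock(&m);
}

With these helpers, foo1 would call wait_for_turn(firstHalfMutex, accessFirstHalf, 1) before its first half and pass_turn(firstHalfMutex, accessFirstHalf, 2) after it, use the second-half pair the same way, and foo2 would do the mirror image with 1 and 2 swapped.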

Sample code of using an atomic pointer to 'lock' certain locations in memory:

#include <vector>
#include <atomic>
#include <thread>

using container = std::vector<std::atomic<double>>;
using container_size_type = container::size_type;

container c(300);

std::atomic<container::pointer> p_busy_elem{ nullptr };

void editor()
{
    for (container_size_type i{ 0 }, sz{ c.size() }; i < sz; ++i)
    {
        p_busy_elem.exchange(&c[i]); // c[i] is busy
        // ... edit c[i] ... // E: calculate a value and assign it to c[i]
        p_busy_elem.exchange(nullptr); // c[i] is no longer busy
    }
}

void reader()
{
    for (container_size_type i{ 0 }, sz{ c.size() }; i < sz; ++i)
    {
        // A1: wait for editor thread to finish editing value
        while (p_busy_elem == &c[i])
        {
            // A2: room a better algorithm to prevent blocking/yielding
            std::this_thread::yield();
        }

        // B: if c[i] is updated in between A and B, this will load the latest value
        auto value = c[i].load();

        // C: c[i] might have changed by this time, but we had the most up to date value we could get without checking again
        // ... use value ...
    }
}

int main()
{
    std::thread t_editor{ editor };
    std::thread t_reader{ reader };
    t_editor.join();
    t_reader.join();
}

In the editor thread, the busy pointer is set to indicate that that memory location is currently being edited (E). If the reader thread attempts to read that value after the busy pointer is set, it will wait until the editing is done before proceeding (A1).

Note on A2: a better scheme could be used here. A list of the indices that were busy when a read was attempted could be kept; we would add i to that list and process it at a later time, as sketched below. Benefit: the loop could simply continue, and indices past the one currently being edited would still be read without blocking.
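
A minimal sketch of that deferred-read idea, reusing c, p_busy_elem, and container_size_type from the sample above (the two-pass structure and the name reader_with_retry are assumptions):

// First pass skips busy elements instead of yielding; a second pass
// retries the indices that were busy the first time around.
void reader_with_retry()
{
    std::vector<container_size_type> busy;

    for (container_size_type i{ 0 }, sz{ c.size() }; i < sz; ++i)
    {
        if (p_busy_elem == &c[i]) { busy.push_back(i); continue; } // remember and move on
        auto value = c[i].load();
        // ... use value ...
    }

    // By now the editor has usually moved well past these indices.
    for (auto i : busy)
    {
        while (p_busy_elem == &c[i])
            std::this_thread::yield();
        auto value = c[i].load();
        // ... use value ...
    }
}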

A copy of the value to read is made (B) in order to use it (C) however needed. This is the last point at which we can check for the most up-to-date value of c[i]; by the time it is used, c[i] might have changed again.

This seems to be the heart of your requirement:

foo1 (first half)
foo2 (first half) and foo1 (second half) (in parallel)
foo1 (first half) and foo2 (second half) (in parallel)
...
foo2(second half)

The easiest way to achieve this interleaving with pthreads is to use barriers.

Initialise a barrier with pthread_barrier_init() using a count of 2 (a minimal sketch follows the two sequences below). foo1() then executes:

first half
pthread_barrier_wait()
second half
pthread_barrier_wait()
...
first half
pthread_barrier_wait()
second half
pthread_barrier_wait()

and foo2() executes a slightly different sequence:

pthread_barrier_wait()
first half
pthread_barrier_wait()
second half
....
pthread_barrier_wait()
first half
pthread_barrier_wait()
second half
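
A minimal sketch of this barrier scheme; the half-iteration placeholders, the thread wrappers, and ROUNDS = 5 are illustrative assumptions, not the original code:

#include <pthread.h>

#define ROUNDS 5

pthread_barrier_t barrier;

void first_half()  { /* ... work on the first half ... */ }
void second_half() { /* ... work on the second half ... */ }

void *run_foo1(void *)
{
    for (int i = 0; i < ROUNDS; ++i)
    {
        first_half();
        pthread_barrier_wait(&barrier);   // releases foo2 to do its first half
        second_half();
        pthread_barrier_wait(&barrier);   // meets foo2 finishing its first half
    }
    return nullptr;
}

void *run_foo2(void *)
{
    for (int i = 0; i < ROUNDS; ++i)
    {
        pthread_barrier_wait(&barrier);   // wait for foo1's first half
        first_half();
        pthread_barrier_wait(&barrier);   // foo1 may begin its next first half
        second_half();
    }
    return nullptr;
}

int main()
{
    pthread_barrier_init(&barrier, NULL, 2);   // two participating threads

    pthread_t t1, t2;
    pthread_create(&t1, NULL, run_foo1, NULL);
    pthread_create(&t2, NULL, run_foo2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_barrier_destroy(&barrier);
    return 0;
}

Each pthread_barrier_wait() call blocks until both threads have reached it, which produces exactly the interleaving listed above.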
