Why does pthread_mutex_t segfault when trying to lock through shared memory from two different processes?
I wrote a super simple wrapper for a pthread_mutex_t meant to be used between two processes:
//basic version just to test using it between two processes
struct MyLock
{
public:
    MyLock() {
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
        pthread_mutex_init(&lock, &attr);
    }
    ~MyLock() {
        pthread_mutex_destroy(&lock);
        pthread_mutexattr_destroy(&attr);
    }
    void lock() {
        pthread_mutex_lock(&lock);
    }
    void unlock() {
        pthread_mutex_unlock(&lock);
    }
private:
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;
};
I can see this lock work fine between regular threads in a single process, but when I run process A, which does the following in a shared memory region:
void* mem; //some shared memory from shm_open
MyLock* myLock = new(mem) MyLock;
//loop sleeping random amounts and calling ->lock and ->unlock
Then process B opens the shared memory object (I verified it is the same region of memory by writing combinations of characters to it) and does this:
MyLock* myLock = reinterpret_cast<MyLock*>(mem);
//same loop for locking and unlocking as process A
but process B segfaults when trying to lock, with the backtrace leading to pthread_mutex_lock() in libpthread.so.0.
What am I doing wrong?
The backtrace I get from process B looks like this:
in pthread_mutex_lock () from /lib64/libpthread.so.0
in MyLock::lock at MyLock.H:50
in Server::setUpSharedMemory at Server.C:59
in Server::Server at Server.C
in main.C:52
The call was the very first call to lock after reinterpret-casting the memory into a MyLock*. If I dump the contents of MyLock in gdb in the crashing process I see:
{
  attr = {
    __size = "\003\000\000\200",
    __align = -2147483645
  },
  lock = {
    __data = {
      __lock = 1,
      __count = 0,
      __owner = 6742, //this is the lightweight process id of a thread in process A
      __nusers = 1,
      __kind = 131,
      __spins = 0,
      __list = {
        __prev = 0x0,
        __next = 0x0
      }
    },
    __size = "\001\000\000\000\000 //etc,
    __align = 1
  }
}
so it looks alright (it looks the same in gdb in the other process as well). I am compiling both applications together, with no additional optimization flags.
You didn't post the code that opens and initializes the shared memory region, but I suspect that part might be responsible for your problem.
Because pthread_mutex_t is much larger than a "combination of characters," you should check your shm_open(3) - ftruncate(2) - mmap(2) sequence by reading and writing a longer (~ KB) string.
Don't forget to check that both endpoints can really write to the shm region and that the written data is really visible to the other side.
Process A: [open and initialize the shm]-[write AAA...AA]-[sleep 5 sec]-[read BBB...BB]-[close the shm]
Process B: (a second or two later) [open the shm]-[read AAA...AA]-[write BBB...BB]-[close the shm]
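The check described above can be sketched roughly as follows. This is a minimal single-process approximation, not the asker's setup: the helper names `map_shm` and `shm_roundtrip_ok` are hypothetical, and two independent mappings of the same object stand in for process A and process B. In a real test each half would run in its own process, with the sleep in between.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: map `size` bytes of the named POSIX shm object.
// With create == true the object is created and sized first (process A's role).
void* map_shm(const char* name, size_t size, bool create) {
    int flags = create ? (O_CREAT | O_RDWR) : O_RDWR;
    int fd = shm_open(name, flags, 0600);
    if (fd == -1) return nullptr;
    if (create && ftruncate(fd, static_cast<off_t>(size)) == -1) {
        close(fd);
        return nullptr;
    }
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : p;
}

// Writes a pattern through one mapping and reads it back through the other,
// in both directions, then removes the object again.
bool shm_roundtrip_ok(const char* name, size_t size) {
    void* a = map_shm(name, size, true);   // stands in for process A
    void* b = map_shm(name, size, false);  // stands in for process B
    bool ok = false;
    if (a && b) {
        std::memset(a, 'A', size);  // A writes AAA...AA, B reads it
        ok = std::string(static_cast<char*>(b), size) == std::string(size, 'A');
        std::memset(b, 'B', size);  // B writes BBB...BB, A reads it
        ok = ok && std::string(static_cast<char*>(a), size) == std::string(size, 'B');
    }
    if (a) munmap(a, size);
    if (b) munmap(b, size);
    shm_unlink(name);
    return ok;
}
```

Calling `shm_roundtrip_ok("/some_test_name", 4096)` exercises a full page rather than a handful of characters, so a region that was never sized with ftruncate, or that was mapped too small, shows up as a failure here instead of as a crash inside pthread_mutex_lock().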
I had a similar issue where the writer process ran as root and the reader processes ran as regular users (the case of a hardware daemon). The readers would segfault as soon as pthread_mutex_lock(), pthread_cond_wait(), or their unlock counterparts were called.
I solved it by setting the SHM file permissions using an appropriate umask:
Writer
umask(0); // clear the mask so the mode passed to shm_open() is applied unchanged
FD=shm_open("the_SHM_file", O_CREAT|O_TRUNC|O_RDWR, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH);
ftruncate(FD, 28672);
SHM=mmap(0, 28672, PROT_READ|PROT_WRITE, MAP_SHARED, FD, 0);
Readers
FD=shm_open("the_SHM_file", O_RDWR, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH);
SHM=mmap(0, 28672, PROT_READ|PROT_WRITE, MAP_SHARED, FD, 0);
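To see why the umask matters here, the following sketch creates a shm object under a given umask and reports the permission bits it actually ends up with. The function name `shm_mode_under_umask` and the object name passed to it are hypothetical. It assumes a system where the mode given to shm_open() is filtered by the process umask, as POSIX describes.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cassert>

// Creates a shm object requesting mode 0666 under the given umask and
// returns the permission bits it actually received, or -1 on error.
int shm_mode_under_umask(const char* name, mode_t mask) {
    mode_t old = umask(mask);  // install the mask under test
    shm_unlink(name);          // start clean; failure is fine if it did not exist
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0666);
    umask(old);                // restore the caller's mask
    if (fd == -1) return -1;
    struct stat st;
    int mode = (fstat(fd, &st) == 0) ? static_cast<int>(st.st_mode & 0777) : -1;
    close(fd);
    shm_unlink(name);
    return mode;
}
```

With a typical mask of 022 the object comes out as 0644, so a reader running as another user cannot open it O_RDWR or map it PROT_WRITE. Locking a mutex writes to its memory, so the readers need write access even if they never modify the payload.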
You don't say what OS you are using, but you don't check the return value of the pthread_mutexattr_setpshared call. It's possible your OS does not support shared mutexes and this call is failing.
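A minimal sketch of initialization with every return value checked might look like this. The names `init_shared_mutex`, `Shared`, and `shared_counter_demo` are hypothetical, and the demo uses an anonymous MAP_SHARED mapping inherited across fork() instead of the asker's shm_open region, purely to keep the example self-contained.

```cpp
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// Initializes a process-shared mutex at `m`.
// Returns 0 on success or the first pthread error code encountered.
int init_shared_mutex(pthread_mutex_t* m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);  // the attr object is not needed after init
    return rc;
}

struct Shared {
    pthread_mutex_t mutex;
    int counter;
};

// Parent and one forked child each increment the counter `n` times under
// the mutex. Returns the final counter value, or -1 on any setup failure.
int shared_counter_demo(int n) {
    void* p = mmap(nullptr, sizeof(Shared), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    Shared* s = static_cast<Shared*>(p);
    s->counter = 0;
    pid_t pid = (init_shared_mutex(&s->mutex) == 0) ? fork() : -1;
    if (pid == -1) {
        munmap(p, sizeof(Shared));
        return -1;
    }
    for (int i = 0; i < n; ++i) {
        pthread_mutex_lock(&s->mutex);
        ++s->counter;
        pthread_mutex_unlock(&s->mutex);
    }
    if (pid == 0) _exit(0);  // child is done; skip the parent's cleanup
    int status = 0;
    waitpid(pid, &status, 0);
    int result = s->counter;
    pthread_mutex_destroy(&s->mutex);
    munmap(p, sizeof(Shared));
    return result;
}
```

If `init_shared_mutex` returns nonzero, the platform rejected one of the steps and the mutex must not be used. The attributes object is destroyed right after pthread_mutex_init(), since the mutex does not depend on it afterwards, so it does not need to live in the shared region at all.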