
C++ Lockless Threading Question - Multiple threads iterating over a contiguous array but never accessing same member data?

In my C++ game engine, I have a job system that uses worker threads to perform various tasks. Each thread is pinned to one of the available cores. Recently, I have been trying to optimize some of my system pipelines by maximizing CPU utilization. Here is some pseudo-ish example code. It isn't an exact replica, but the situation is similar.

struct entityState {
  uint8 * byteBuffer; // Serialized binary data for the Entity
  uint8 * compressedData; // Compressed version of Entity data
  uint64  guid; // Unique ID
  gameTimeMS lastUpdated; // last time buffer was updated in milliseconds
  uint32 numUpdates; // Count of the number of updates
  uint32 numTimesAckedOverNetwork; // How many times client acked the data
  const char * typeData; // Type data in place of RTTI
  bool markedForDelete; // Whether this object should be deleted next frame
  const char * debugData; // In debug configs, store meta data
  // More member data but the point is made
};

// For example's sake, I have a contiguous array of entityState data
List< entityState * > entityStateList;
PopulateListWithEntityStateData(); // ~20,000 entityState ptrs on average
SortEntityStateList();
// Fire off 5 jobs each with their own worker thread
StartEntityStateJobs();

I then have 5 jobs that operate on this list at the same time with no mutexes or critical sections. Each job function accesses the array either via a binary search on some criterion, such as a guid, or via a plain linear search. Here is the catch: none of the job functions modify the same member data of the entityState ptrs in the entityStateList. However, they can dereference the same entityState ptr, because the binary and linear searches can land on the same element. But, I repeat, they never modify the same member data at the same time. No member data ptr is dereferenced by more than one thread at the same time.

I have run this simulation in a unit test and encountered no issues. However, I have some programmer friends who say there is a very, very small probability that this will cause undefined behavior when threads pause and resume while dereferencing the same entityState ptr.

The other point I have heard is that this setup has only worked because the entityState struct does not fit in a single cache line, so fetching it is split across lines, which in and of itself acts as data protection because the struct data is separated into different cache lines. To clarify, let's say the top half fits in one cache line and the bottom half in another, each job function only operates on one data member of the entityState ptr, and the majority of the time that member happens to be on a different cache line. I do not use any atomic modifiers or operations on the member data because no jobs touch the same member data.

Lastly, I also have some programmer friends who say this is perfectly thread safe.

Nevertheless, I have three different statements, and my low-level knowledge of multi-threading is not good enough to ascertain which claim is correct.

The question is: is it possible for a super-rare crash to happen in the wild, 1 out of 'x' times? Even 1 in a million is not acceptable. Is this a safe, lockless threading mechanism for performing multiple operations on the list in parallel? Try to overlook the triviality of the example data. It is much more complex in my engine. This code can run on multiple platforms, such as PC, Linux, and consoles. It has yet to crash, but the exposure and testing are limited. I admit I am not a low-level expert, but this is saving precious performance time. So, am I waiting to run into a land mine, or is this safe? The compiler is gcc, targeting C++11. Also, please avoid the performance topic of locality unless it's related to threading and/or thread safety. I know cache misses are bad.

The Question - Is it thread safe or not? Either way, please explain why in as much detail as possible. I would like to bolster my low-level knowledge.

@walnut already explained in detail that "accessing different elements of an array is guaranteed to not cause data races".

However, you mentioned that you have multiple job functions updating the entityState, and that these functions are ordered by some jobchain object. You did not go into detail about how this jobchain is implemented, but you have to ensure that it establishes a proper happens-before relation between the different job functions; otherwise you do have a data race on the entityState members.

And I also agree with @rustyx - run your code with ThreadSanitizer. It helps uncover a lot of threading issues, including data races.
