
How to pass data to a running thread

When using pthread, I can pass data at thread creation time.

What is the proper way of passing new data to an already running thread?

I'm considering making a global variable and having my thread read from that.

Thanks

That will certainly work. Basically, threads are just lightweight processes that share the same memory space. Global variables, being in that memory space, are available to every thread.

The trick is not with the readers so much as the writers. If you have a simple chunk of global memory, like an int, then assigning to that int will probably be safe. But consider something a little more complicated, like a struct. Just to be definite, let's say we have

struct S { int a; float b; } s1, s2;

Now s1, s2 are variables of type struct S. We can initialize them

s1 = (struct S){ 42, 3.14f };   /* C99 compound literal */

and we can assign one to the other

s2 = s1;

But when we assign them, the processor isn't guaranteed to complete the assignment to the whole struct in one step -- we say it's not atomic. So now let's imagine two threads:

thread 1:
   while (true){
      printf("{%d,%f}\n", s2.a, s2.b );
      sleep(1);
   }

thread 2:
   while(true){
      sleep(1);
      s2 = s1;
      s1.a += 1;
      s1.b += 3.14f ;
   }

We'd expect s2 to have the values {42, 3.14}, {43, 6.28}, {44, 9.42}, and so on.

But what we actually see printed might be something like

 {42,3.14}
 {43,3.14}
 {43,6.28}

or

 {43,3.14}
 {44,6.28}

and so on. The problem is that thread 1 may get control and "look at" s2 at any time during that assignment.

The moral is that while global memory is a perfectly workable way to do it, you need to take into account the possibility that your threads will cross over one another. There are several solutions to this, the most basic being to use semaphores. A semaphore has two operations, confusingly named from Dutch as P and V.

P simply waits until the variable is greater than zero and then goes on, subtracting 1 from the variable; V adds 1 to the variable. The only thing special is that they do this atomically -- they can't be interrupted.

Now, you code it as

thread 1:
   while (true){
      P();
      printf("{%d,%f}\n", s2.a, s2.b );
      V();
      sleep(1);
   }

thread 2:
   while(true){
      sleep(1);
      P();
      s2 = s1;
      V();
      s1.a += 1;
      s1.b += 3.14f ;
   }

and you're guaranteed that thread 2 will never be caught halfway through the assignment while thread 1 is trying to print.

(POSIX provides semaphores, by the way.)
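
As a concrete version of the P()/V() pseudocode above, here is a minimal sketch assuming POSIX semaphores from semaphore.h; P maps onto sem_wait and V onto sem_post, with the semaphore initialized to 1 so it behaves like a lock:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

struct S { int a; float b; } s1 = { 42, 3.14f }, s2;

static sem_t lock;   /* binary semaphore: 1 = free, 0 = taken */

static void *reader(void *arg)
{
    (void)arg;
    while (1) {
        sem_wait(&lock);                  /* P */
        printf("{%d,%f}\n", s2.a, s2.b);
        sem_post(&lock);                  /* V */
        sleep(1);
    }
    return NULL;
}

static void *writer(void *arg)
{
    (void)arg;
    while (1) {
        sleep(1);
        sem_wait(&lock);                  /* P */
        s2 = s1;                          /* whole-struct copy is now protected */
        sem_post(&lock);                  /* V */
        s1.a += 1;                        /* only this thread touches s1 */
        s1.b += 3.14f;
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    sem_init(&lock, 0, 1);                /* start "unlocked" */
    pthread_create(&t1, NULL, reader, NULL);
    pthread_create(&t2, NULL, writer, NULL);
    pthread_join(t1, NULL);               /* both loops run forever in this sketch */
    return 0;
}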

I have been using the message-passing, producer-consumer, queue-based comms mechanism suggested by asveikau for decades, without any problems specifically related to multithreading. There are some advantages:

1) The 'threadCommsClass' instances passed on the queue can often contain everything required for the thread to do its work: members for input data, members for output data, methods for the thread to call to do the work, somewhere to put any error/exception messages, and a 'returnToSender(this)' event to call that returns everything to the requester by some thread-safe means the worker thread does not need to know about. The worker thread then runs asynchronously on one set of fully encapsulated data that requires no locking. 'returnToSender(this)' might queue the object onto another producer-consumer queue, it might PostMessage it to a GUI thread, it might release the object back to a pool, or it might just dispose() it. Whatever it does, the worker thread does not need to know about it. (A minimal C sketch of such an object appears after this list.)

2) There is no need for the requesting thread to know anything about which thread did the work - all the requestor needs is a queue to push onto. In an extreme case, the worker thread on the other end of the queue might serialize the data and communicate it to another machine over a network, only calling returnToSender(this) when a network reply is received - the requestor does not need to know this detail, only that the work has been done.

3) It is usually possible to arrange for the 'threadCommsClass' instances and the queues to outlive both the requester thread and the worker thread. This greatly eases the problems that arise when the requester or worker is terminated and dispose()'d before the other - since they share no data directly, there can be no access violation or the like. This also blows away all those 'I can't stop my worker thread because it's stuck on a blocking API' issues - why bother stopping it if it can just be orphaned and left to die, with no possibility of writing to something that has been freed?

4) A threadpool reduces to a one-line for loop that creates several worker threads and passes them the same input queue.

5) Locking is restricted to the queues. The more mutexes, condVars, critical sections and other synchro locks there are in an app, the more difficult it is to control it all, and the greater the chance of an intermittent deadlock that is a nightmare to debug. With queued messages, (ideally), only the queue class has locks. The queue class must work 100% with multiple producers/consumers, but that's one class, not an app full of uncoordinated locking, (yech).

6) A threadCommsClass can be raised anytime, anywhere, in any thread, and pushed onto a queue. It's not even necessary for the requester code to do it directly, eg. a call to a logger class method, 'myLogger.logString("Operation completed successfully");', could copy the string into a comms object, queue it up to the thread that performs the log write, and return 'immediately'. It is then up to the logger class thread to handle the log data when it dequeues it - it may write it to a log file, it may find after a minute that the log file is unreachable because of a network problem, it may decide that the log file is too big and archive it and start another one, or it may write the string to disk and then PostMessage the threadCommsClass instance on to a GUI thread for display in a terminal window, whatever. It doesn't matter to the log-requesting thread, which just carries on, as do any other threads that have called for logging, without significant impact on performance.

7) If you do need to kill off a thread that is waiting on a queue, rather than waiting for the OS to kill it on app close, just queue it a message telling it to terminate.
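
To make point 1 a little more concrete, here is a minimal C sketch of what such a comms object might look like. The struct, field names and the returnToSender callback are purely illustrative, not a fixed API:

#include <stddef.h>

/* Illustrative only: one self-contained work item that travels through a queue.
 * The worker fills in 'output' and 'error', then hands the whole object back
 * via the return_to_sender callback, so requester and worker never share state
 * directly and no extra locking is needed on the payload itself. */
struct thread_comms {
    /* input for the worker */
    const char *input;
    size_t      input_len;

    /* results filled in by the worker */
    char   output[256];
    int    error;            /* 0 = success */

    /* how to give the object back: post to another queue, a GUI thread, a pool... */
    void (*return_to_sender)(struct thread_comms *self);
    void  *sender_context;   /* whatever the requester needs to route the reply */
};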

There are surely disadvantages:

1) Shoving data directly into thread members, signaling it to run and waiting for it to finish is easier to understand and will be faster, assuming that the thread does not have to be created each time.

2) Truly asynchronous operation, where the thread is queued some work and, sometime later, returns it by calling some event handler that has to communicate the results back, is more difficult to handle for developers used to single-threaded code, and often requires a state-machine type of design where context data must be sent in the threadCommsClass so that the correct actions can be taken when the results come back. If there is the occasional case where the requestor just has to wait, it can send an event in the threadCommsClass that gets signaled by the returnToSender method, but this is obviously more complex than simply waiting on some thread handle for completion.

Whatever design is used, forget the simple global variables, as other posters have said. There is a case for some global objects in thread comms - one I use very often is a thread-safe pool of threadCommsClass instances (this is just a queue that gets pre-filled with objects). Any thread that wishes to communicate has to get a threadCommsClass instance from the pool, load it up and queue it off. When the comms is done, the last thread to use it releases it back to the pool. This approach prevents runaway new(), and allows me to easily monitor the pool level during testing without any complex memory managers (I usually dump the pool level to a status bar every second with a timer). Leaked objects (level goes down) and double-released objects (level goes up) are easily detected and so get fixed.

Multithreading can be safe and deliver scalable, high-performance apps that are almost a pleasure to maintain/enhance, (almost:), but you have to lay off the simple globals - treat them like tequila: a quick and easy high for now, but you just know they'll blow your head off tomorrow.

Good luck!

Martin

Global variables are bad to begin with, and even worse with multi-threaded programming. Instead, the creator of the thread should allocate some sort of context object that's passed to pthread_create, which contains whatever buffers, locks, condition variables, queues, etc. are needed for passing information to and from the thread.
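
As a minimal sketch of that idea (the struct and field names here are illustrative, not a fixed API), the context might bundle a mutex, a condition variable, and the data the thread should watch:

#include <pthread.h>
#include <stdlib.h>

/* Illustrative context object: everything the thread needs, no globals. */
struct worker_ctx {
    pthread_mutex_t lock;
    pthread_cond_t  has_work;
    int             new_value;   /* the "data passed to the running thread" */
    int             has_value;   /* flag: new_value is valid */
    int             shutdown;
};

static void *worker(void *arg)
{
    struct worker_ctx *ctx = arg;
    pthread_mutex_lock(&ctx->lock);
    while (!ctx->shutdown) {
        while (!ctx->has_value && !ctx->shutdown)
            pthread_cond_wait(&ctx->has_work, &ctx->lock);
        if (ctx->has_value) {
            /* ... use ctx->new_value ... */
            ctx->has_value = 0;
        }
    }
    pthread_mutex_unlock(&ctx->lock);
    return NULL;
}

/* Caller side: create the thread with its context, then feed it later. */
int start_and_feed(void)
{
    struct worker_ctx *ctx = calloc(1, sizeof *ctx);
    pthread_t tid;
    pthread_mutex_init(&ctx->lock, NULL);
    pthread_cond_init(&ctx->has_work, NULL);
    pthread_create(&tid, NULL, worker, ctx);

    pthread_mutex_lock(&ctx->lock);      /* pass new data to the running thread */
    ctx->new_value = 123;
    ctx->has_value = 1;
    pthread_cond_signal(&ctx->has_work);
    pthread_mutex_unlock(&ctx->lock);

    pthread_mutex_lock(&ctx->lock);      /* tell it to stop, then reap it */
    ctx->shutdown = 1;
    pthread_cond_signal(&ctx->has_work);
    pthread_mutex_unlock(&ctx->lock);
    pthread_join(tid, NULL);
    free(ctx);
    return 0;
}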

You will need to build this yourself. The most typical approach requires some cooperation from the other thread, since it would be a bit of a weird interface to "interrupt" a running thread with some data and code to execute on it... That would also have some of the same trickiness as something like POSIX signals or IRQs, both of which make it easy to shoot yourself in the foot while processing, if you haven't carefully thought it through... (Simple example: You can't call malloc inside a signal handler, because you might be interrupted in the middle of malloc, so you might crash while accessing malloc's internal data structures, which are only partially updated.)

The typical approach is to have your thread's start routine basically be an event loop. You can build a queue structure and pass that as the argument to the thread creation routine. Then other threads can enqueue things, and the thread's event loop will dequeue and process the data. Note this is cleaner than a global variable (or global queue), because it can scale to multiple of these queues.

You will need some synchronization on that queue data structure. Entire books could be written about how to implement your queue's synchronization, but the simplest thing would be a lock and a semaphore. When modifying the queue, threads take the lock. When waiting for something to be dequeued, consumer threads wait on a semaphore, which is incremented by enqueuers. It's also a good idea to implement some mechanism to shut down the consumer thread.
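
A minimal sketch of that shape, assuming POSIX semaphores and a fixed-size ring buffer (names and the capacity are illustrative, and for brevity it assumes the queue never fills and skips error checking): the mutex guards the queue, the semaphore counts available items, and a NULL message serves as the shutdown signal.

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define QCAP 64

struct queue {
    void           *items[QCAP];
    int             head, tail;
    pthread_mutex_t lock;   /* protects head/tail/items */
    sem_t           filled; /* counts items available to dequeue */
};

static void queue_init(struct queue *q)
{
    q->head = q->tail = 0;
    pthread_mutex_init(&q->lock, NULL);
    sem_init(&q->filled, 0, 0);
}

static void queue_push(struct queue *q, void *msg)
{
    pthread_mutex_lock(&q->lock);
    q->items[q->tail] = msg;
    q->tail = (q->tail + 1) % QCAP;   /* sketch: assumes the queue never fills */
    pthread_mutex_unlock(&q->lock);
    sem_post(&q->filled);             /* wake one waiting consumer */
}

static void *queue_pop(struct queue *q)
{
    void *msg;
    sem_wait(&q->filled);             /* block until something is queued */
    pthread_mutex_lock(&q->lock);
    msg = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    pthread_mutex_unlock(&q->lock);
    return msg;
}

/* The thread routine is just an event loop over the queue; NULL means "shut down". */
static void *worker(void *arg)
{
    struct queue *q = arg;
    for (;;) {
        char *msg = queue_pop(q);
        if (msg == NULL)
            break;
        printf("worker got: %s\n", msg);
    }
    return NULL;
}

int main(void)
{
    struct queue q;
    pthread_t tid;
    queue_init(&q);
    pthread_create(&tid, NULL, worker, &q);
    queue_push(&q, "hello");          /* pass data to the already-running thread */
    queue_push(&q, "more work");
    queue_push(&q, NULL);             /* ask it to exit */
    pthread_join(tid, NULL);
    return 0;
}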
