Why might threads be considered "evil"?

I was reading the SQLite FAQ, and came upon this passage:

Threads are evil. Avoid them.

I don't quite understand the statement "Threads are evil". If that is true, then what is the alternative?

My superficial understanding of threads is:

  • Threads make concurrency happen. Otherwise, CPU horsepower would be wasted waiting for (e.g.) slow I/O.
  • But the downside is that you must synchronize your logic to avoid contention, and you have to protect shared resources.

Note: As I am not familiar with threads on Windows, I hope the discussion will be limited to Linux/Unix threads.

When people say that "threads are evil", they usually do so in the context of saying "processes are good". Threads implicitly share all application state and handles (and thread locals are opt-in). This means that there are plenty of opportunities to forget to synchronize (or not even understand that you need to synchronize!) while accessing that shared data.

Processes have separate memory spaces, and any communication between them is explicit. Furthermore, the primitives used for interprocess communication are often such that you don't need to synchronize at all (e.g. pipes). And you can still share state directly if you need to, using shared memory, but that is also explicit in every given instance. So there are fewer opportunities to make mistakes, and the intent of the code is more explicit.

Simple answer, the way I understand it...

Most threading models use "shared state concurrency," which means that two threads of execution can share the same memory at the same time. If one thread doesn't know what the other is doing, it can modify the data in a way that the other thread doesn't expect. This causes bugs.

Threads are "evil" because you need to wrap your mind around n threads all working on the same memory at the same time, and all of the fun things that go with it (deadlocks, race conditions, etc).
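To make the "n threads on the same memory" problem concrete, here is a minimal sketch of a lost-update race. The class name LostUpdateDemo and its run method are made up for this illustration. Each thread increments a plain static counter and an AtomicLong; the plain counter frequently ends up short because `++` is a separate read, add, and write.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class, not part of any library or of the original post.
class LostUpdateDemo {
    static long plainCounter = 0;                              // shared, unsynchronized
    static final AtomicLong atomicCounter = new AtomicLong();  // shared, atomic

    // Starts `threads` threads; each increments both counters `perThread` times.
    // Returns { plain counter, atomic counter } after all threads have finished.
    static long[] run(int threads, int perThread) {
        plainCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    plainCounter++;                   // read-modify-write: updates can be lost
                    atomicCounter.incrementAndGet();  // a single atomic operation
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { plainCounter, atomicCounter.get() };
    }

    public static void main(String[] args) {
        long[] result = run(4, 100_000);
        System.out.println("plain  = " + result[0] + " (may be less than 400000)");
        System.out.println("atomic = " + result[1]);
    }
}
```

The plain counter's final value varies from run to run, which is exactly what makes this class of bug hard to reproduce.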

You might read up on the Clojure (immutable data structures) and Erlang (message passing) concurrency models for alternative ideas on how to achieve similar ends.
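As a rough illustration of the message-passing idea (not Erlang itself, just a Java approximation), the sketch below gives a worker thread a mailbox in the form of a BlockingQueue. The names MailboxDemo, sumViaMessages and STOP are invented for this example. The running sum is touched only by the worker, so no lock is needed around it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; a loose Java analogue of an actor with a mailbox.
class MailboxDemo {
    // Sentinel meaning "no more messages". Assumes real values never equal it.
    private static final int STOP = Integer.MIN_VALUE;

    static long sumViaMessages(int[] values) {
        BlockingQueue<Integer> mailbox = new ArrayBlockingQueue<>(16);
        long[] result = new long[1];  // written only by the worker, read after join()

        Thread worker = new Thread(() -> {
            long sum = 0;             // state owned exclusively by the worker
            try {
                for (int msg = mailbox.take(); msg != STOP; msg = mailbox.take()) {
                    sum += msg;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            result[0] = sum;
        });
        worker.start();

        try {
            for (int v : values) {
                mailbox.put(v);       // communication is explicit: send a message
            }
            mailbox.put(STOP);
            worker.join();            // join() makes the worker's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(sumViaMessages(new int[] { 1, 2, 3, 4 }));
    }
}
```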

What makes threads "evil" is that once you introduce more than one stream of execution into your program, you can no longer count on your program to behave in a deterministic manner.

That is to say: given the same set of inputs, a single-threaded program will (in most cases) always do the same thing.

A multi-threaded program, given the same set of inputs, may well do something different every time it is run, unless it is very carefully controlled. That is because the order in which the different threads run different bits of code is determined by the OS's thread scheduler combined with a system timer, and this introduces a good deal of "randomness" into what the program does when it runs.

The upshot is: debugging a multi-threaded program can be much harder than debugging a single-threaded program, because if you don't know what you are doing it can be very easy to end up with a race condition or deadlock bug that only appears (seemingly) at random once or twice a month. The program will look fine to your QA department (since they don't have a month to run it) but once it's out in the field, you'll be hearing from customers that the program crashed, and nobody can reproduce the crash.... bleah.

To sum up, threads aren't really "evil", but they are strong juju and should not be used unless (a) you really need them and (b) you know what you are getting yourself into. If you do use them, use them as sparingly as possible, and try to make their behavior as stupid-simple as you possibly can. Especially with multithreading, if anything can go wrong, it (sooner or later) will.

I would interpret it another way. It's not that threads are evil, it's that side effects are evil in a multithreaded context (which is a lot less catchy to say).

A side effect in this context is something that affects state shared by more than one thread, be it global or just shared. I recently wrote a review of Spring Batch and one of the code snippets used is:

private static Map<Long, JobExecution> executionsById = TransactionAwareProxyFactory.createTransactionalMap();
private static long currentId = 0;

public void saveJobExecution(JobExecution jobExecution) {
  Assert.isTrue(jobExecution.getId() == null);
  Long newId = currentId++;
  jobExecution.setId(newId);
  jobExecution.incrementVersion();
  executionsById.put(newId, copy(jobExecution));
}

Now there are at least three serious threading issues in less than 10 lines of code here. An example of a side effect in this context would be updating the currentId static variable.
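For comparison, one way the ID counter and the map could be made safe is sketched below. This is not Spring Batch code: SafeExecutionStore is a made-up class, a String stands in for JobExecution, and it addresses only the shared counter and the shared map, not the id check or the version increment in the snippet above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the store in the snippet; String replaces JobExecution.
class SafeExecutionStore {
    private final Map<Long, String> executionsById = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    // getAndIncrement() is atomic, so two threads can never receive the same id.
    long save(String execution) {
        long id = nextId.getAndIncrement();
        executionsById.put(id, execution);
        return id;
    }

    int size() {
        return executionsById.size();
    }

    // Saves from several threads at once and returns how many entries were stored.
    static int saveConcurrently(int threads, int perThread) {
        SafeExecutionStore store = new SafeExecutionStore();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    store.save("job");
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return store.size();
    }

    public static void main(String[] args) {
        System.out.println(saveConcurrently(4, 1_000) + " executions stored");
    }
}
```

With a plain `long` counter and `currentId++`, two threads could read the same value and overwrite each other's map entry, so the stored count could come up short.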

Functional programming languages (Haskell, Scheme, OCaml, Lisp, others) tend to espouse "pure" functions. A pure function is one with no side effects. Many imperative languages (e.g. Java, C#) also encourage the use of immutable objects (an immutable object is one whose state cannot change once created).

The reason for (or at least the effect of) both of these things is largely the same: they make multithreaded code much easier. A pure function by definition is threadsafe. An immutable object by definition is threadsafe.
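A small sketch of both ideas in Java follows; ImmutablePoint and PureFunctions are names invented for this example. All fields are final and "modification" returns a new object, so instances can be handed to any number of threads without locking.

```java
// Hypothetical immutable value type: no setters, all fields final.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // "Changing" x creates a new object; the original is never modified.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }
}

class PureFunctions {
    // Pure: the result depends only on the argument, and nothing shared is touched.
    static int manhattanLength(ImmutablePoint p) {
        return Math.abs(p.x()) + Math.abs(p.y());
    }

    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(1, -2);
        ImmutablePoint q = p.withX(5);
        System.out.println(manhattanLength(p) + " " + manhattanLength(q));
    }
}
```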

The advantage processes have is that there is less shared state (generally). In traditional UNIX C programming, doing a fork() to create a new process would result in shared process state, and this was used as a means of IPC (inter-process communication), but generally that state is replaced (with exec()) with something else.

But threads are much cheaper to create and destroy and they take fewer system resources (in fact, the operating system itself may have no concept of threads, yet you can still create multithreaded programs). These are called green threads.

The paper you linked to seems to explain itself very well. Did you read it?

Keep in mind that a thread can refer to the programming-language construct (as in most procedural or OOP languages, you create a thread manually and tell it to execute a function), or it can refer to the hardware construct (each CPU core executes one thread at a time).

The hardware-level thread is obviously unavoidable; it's just how the CPU works. But the CPU doesn't care how the concurrency is expressed in your source code. It doesn't have to be by a "beginthread" function call, for example. The OS and the CPU just have to be told which instruction threads should be executed.

His point is that if we used better languages than C or Java, with a programming model designed for concurrency, we could get concurrency basically for free. If we'd used a message-passing language, or a functional one with no side effects, the compiler would be able to parallelize our code for us. And it would work.

Threads aren't any more "evil" than hammers or screwdrivers or any other tools; they just require skill to utilize. The solution isn't to avoid them; it's to educate yourself and up your skill set.

Creating a lot of threads without constraint is indeed evil. Using a pooling mechanism (a thread pool) will mitigate this problem.
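As a sketch of the pooling idea, the example below runs many small tasks on a fixed number of threads using the standard java.util.concurrent ExecutorService. The class and method names (PoolDemo, sumOfSquares) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class: n tasks, but never more than poolSize threads.
class PoolDemo {
    static long sumOfSquares(int n, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long k = i;
                tasks.add(() -> k * k);   // each task is independent and has no side effects
            }
            long total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();              // always release the pool's threads
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(100, 4));
    }
}
```

The results are combined on the calling thread, so the only shared state is what the executor manages internally.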

Another way threads are "evil" is that most framework code is not designed to deal with multiple threads, so you have to manage your own locking mechanism for those data structures.

Threads are good, but you have to think about how and when you use them, and remember to measure whether there really is a performance benefit.

A thread is a bit like a lightweight process. Think of it as an independent path of execution within an application. The thread runs in the same memory space as the application and therefore has access to all the same resources, global objects and global variables.

The good thing about them: you can parallelise a program to improve performance. Some examples: 1) In an image editing program, a thread may run the filter processing independently of the GUI. 2) Some algorithms lend themselves to multiple threads.

What's bad about them? If a program is poorly designed, they can lead to deadlock issues where both threads are waiting on each other to access the same resource. And secondly, program design can be more complex because of this. Also, some class libraries don't support threading, e.g. the C library function "strtok" is not "thread safe". In other words, if two threads were to use it at the same time, they would clobber each other's results. Fortunately, there are often thread-safe alternatives, e.g. the Boost library.
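One common way to design out the deadlock described above is to always take locks in the same global order. The sketch below assumes made-up types (DemoAccount, OrderedTransfers) and assumes account ids are unique. Two threads transfer in opposite directions, which could deadlock if each locked its own "from" account first.

```java
// Hypothetical account type for illustration only.
class DemoAccount {
    final int id;     // assumed unique; defines the global lock order
    long balance;     // only accessed while holding this account's monitor

    DemoAccount(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }
}

class OrderedTransfers {
    // Always locks the account with the smaller id first, whatever the direction.
    static void transfer(DemoAccount from, DemoAccount to, long amount) {
        DemoAccount first = from.id < to.id ? from : to;
        DemoAccount second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Runs a->b and b->a transfers concurrently; returns the final balances.
    static long[] runOpposingTransfers(int rounds) {
        DemoAccount a = new DemoAccount(1, 1_000);
        DemoAccount b = new DemoAccount(2, 1_000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance, b.balance };
    }

    public static void main(String[] args) {
        long[] balances = runOpposingTransfers(10_000);
        System.out.println(balances[0] + " " + balances[1]);
    }
}
```

If `transfer` instead locked `from` and then `to`, thread 1 could hold `a` while waiting for `b`, and thread 2 could hold `b` while waiting for `a`.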

Threads are not evil; they can be very useful indeed.

Under Linux/Unix, threading hasn't been well supported in the past, although I believe Linux now has POSIX thread support (i.e. pthreads), and other Unices now support threading via libraries or natively.

The most common alternative to threading on Linux/Unix platforms is fork. A fork is simply a copy of a program, including its open file handles and global variables. fork() returns 0 to the child process and the child's process ID to the parent. It's an older way of doing things under Linux/Unix but still well used. Threads use less memory than fork and are quicker to start up. Also, inter-process communication is more work than communication between simple threads.

In a simple sense, you can think of a thread as another instruction pointer in the current process. In other words, it points the IP of another processor to some code in the same executable. So instead of having one instruction pointer moving through the code, there are two or more IPs executing instructions from the same executable and address space simultaneously.

Remember the executable has its own address space with data, stack, etc. So now that two or more instructions are being executed simultaneously, you can imagine what happens when more than one of the instructions wants to read or write the same memory address at the same time.

The catch is that threads operate within the process address space and are not afforded the protection mechanisms from the processor that full-blown processes are. (Forking a process on UNIX is standard practice and simply creates another process.)

Out-of-control threads can consume CPU cycles, chew up RAM, cause exceptions, etc., and the only way to stop them is to tell the OS process scheduler to forcibly terminate the thread by nullifying its instruction pointer (i.e. stop executing). If you forcibly tell a CPU to stop executing a sequence of instructions, what happens to the resources that have been allocated or are being operated on by those instructions? Are they left in a stable state? Are they properly freed? Etc.
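Because forcible termination leaves those questions open, the usual approach in Java is cooperative cancellation: ask the thread to stop and let it clean up after itself (Thread.stop() is deprecated for the reasons above). A minimal sketch, with made-up names and a sleep standing in for real work:

```java
// Hypothetical demo class showing cooperative cancellation via interrupt().
class CancellableWorker {
    // Starts a worker, asks it to stop, and reports whether its cleanup ran.
    static boolean runAndCancel() {
        boolean[] cleanedUp = new boolean[1];  // written by the worker, read after join()
        Thread worker = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(10);          // stand-in for one unit of real work
                }
            } catch (InterruptedException e) {
                // Interrupted while sleeping: fall through to the cleanup below.
            } finally {
                cleanedUp[0] = true;           // release resources, restore invariants, etc.
            }
        });
        worker.start();
        worker.interrupt();                    // a request to stop, not a forced kill
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cleanedUp[0];
    }

    public static void main(String[] args) {
        System.out.println("cleanup ran: " + runAndCancel());
    }
}
```

This only works if the thread checks for interruption; a thread stuck in a tight loop that never checks remains the problem the answer describes.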

So, yes, threads require more thought and responsibility than executing a process because of the shared resources.

For any application that requires stable and secure execution for long periods of time without failure or maintenance, threads are always a tempting mistake. They invariably turn out to be more trouble than they are worth. They produce rapid results and prototypes that seem to be performing correctly, but after a couple of weeks or months of running you discover that they have critical flaws.

As mentioned by another poster, once you use even a single thread in your program you have opened a non-deterministic path of code execution that can produce an almost infinite number of conflicts in timing, memory sharing and race conditions. Most expressions of confidence in solving these problems come from people who have learned the principles of multithreaded programming but have yet to experience the difficulties in solving them.

Threads are evil. Good programmers avoid them wherever humanly possible. The alternative of forking was offered here, and it is often a good strategy for many applications. The notion of breaking your code down into separate execution processes which run with some form of loose coupling often turns out to be an excellent strategy on platforms that support it. Threads running together in a single program is not a solution. It is usually the creation of a fatal architectural flaw in your design that can only be truly remedied by rewriting the entire program.

The recent drift towards event-oriented concurrency is an excellent development innovation. These kinds of programs usually prove to have great endurance after they are deployed.

I've never met a young engineer who didn't think threads were great. I've never met an older engineer who didn't shun them like the plague.

Being an older engineer, I heartily agree with the answer by Texas Arcane.

Threads are very evil because they cause bugs that are extremely difficult to solve. I have literally spent months solving sporadic race conditions. One example caused trams to suddenly stop about once a month in the middle of the road and block traffic until towed away. Luckily I didn't create the bug, but I did get to spend 4 months full-time solving it...

It's a tad late to add to this thread, but I would like to mention a very interesting alternative to threads: asynchronous programming with coroutines and event loops. This is being supported by more and more languages, and does not have the problem of race conditions that multi-threading has.

It can replace multi-threading in cases where it is used to wait on events from multiple sources, but not where calculations need to be performed in parallel on multiple CPU cores.
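Java has no built-in coroutines, so the sketch below shows only the event-loop half of the idea, with invented names (TinyEventLoop, post, run, demo). All handlers run on a single thread, one at a time, so the shared log needs no locking, and two "requests" still interleave because each handler posts its continuation instead of blocking.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical single-threaded event loop; real ones also wait on I/O and timers.
class TinyEventLoop {
    private final Queue<Runnable> events = new ArrayDeque<>();

    void post(Runnable event) {
        events.add(event);
    }

    // Runs handlers one at a time until the queue is empty.
    void run() {
        Runnable event;
        while ((event = events.poll()) != null) {
            event.run();
        }
    }

    static List<String> demo() {
        TinyEventLoop loop = new TinyEventLoop();
        List<String> log = new ArrayList<>();  // shared by all handlers, no lock needed
        loop.post(() -> {
            log.add("request A: start");
            loop.post(() -> log.add("request A: done"));  // continuation runs later
        });
        loop.post(() -> {
            log.add("request B: start");
            loop.post(() -> log.add("request B: done"));
        });
        loop.run();
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Unlike the threaded examples, the order of the log entries is the same on every run, because nothing preempts a handler while it is executing.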

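Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.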
