How many threads should I use in my Java program?

I recently inherited a small Java program that takes information from a large database, does some processing and produces a detailed image regarding the information. The original author wrote the code using a single thread, then later modified it to allow it to use multiple threads.

In the code he defines a constant:

//  number of threads
public static final int THREADS =  Runtime.getRuntime().availableProcessors();

This constant then sets the number of threads that are used to create the image.

I understand his reasoning that the number of threads cannot be greater than the number of available processors, so set it to that amount to get the full potential out of the processor(s). Is this correct? Or is there a better way to utilize the full potential of the processor(s)?

EDIT: To give some more clarification, the specific algorithm that is being threaded scales with the resolution of the picture being created (1 thread per pixel). That is obviously not the best solution though. The work that this algorithm does is what takes all the time, and it is wholly mathematical operations; there are no locks or other factors that will cause any given thread to sleep. I just want to maximize the program's CPU utilization to decrease the time to completion.

Threads are fine, but as others have noted, you have to be highly aware of your bottlenecks. Your algorithm sounds like it would be susceptible to cache contention between multiple CPUs - this is particularly nasty because it has the potential to hit the performance of all of your threads (normally you think of using multiple threads to continue processing while waiting for slow or high latency IO operations).

Cache contention is a very important aspect of using multiple CPUs to process a highly parallelized algorithm: make sure that you take your memory utilization into account. If you can construct your data objects so each thread has its own memory that it is working on, you can greatly reduce cache contention between the CPUs. For example, it may be easier to have a big array of ints and have different threads working on different parts of that array - but in Java, the bounds checks on that array are going to be trying to access the same address in memory, which can cause a given CPU to have to reload data from L2 or L3 cache.

Split the data into separate data structures, and configure those data structures so they are thread local (it might even be better to use ThreadLocal, which gives each thread its own independent copy of a value).
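A minimal sketch of the "each thread has its own memory" idea, assuming the image is a flat int[] of pixels. The class name BandedRender, the shade function and the band layout are hypothetical and not taken from the original program. Each worker fills a private buffer for one contiguous band of rows, then copies it into its own slice of the shared array once at the end. Whether this actually beats writing straight into the shared array is something to measure, not assume.

```java
import java.util.ArrayList;
import java.util.List;

final class BandedRender {
    // Hypothetical stand-in for the real per-pixel math.
    static int shade(int x, int y) {
        return (x * 31 + y * 17) & 0xFF;
    }

    // Renders a width x height image using `threads` workers, one band of rows each.
    static int[] render(int width, int height, int threads) {
        int[] image = new int[width * height];
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            // Contiguous row range [firstRow, lastRow) owned by this worker.
            final int firstRow = (int) ((long) height * t / threads);
            final int lastRow = (int) ((long) height * (t + 1) / threads);
            Thread worker = new Thread(() -> {
                // Private buffer: no other thread reads or writes it.
                int[] band = new int[(lastRow - firstRow) * width];
                for (int y = firstRow; y < lastRow; y++) {
                    for (int x = 0; x < width; x++) {
                        band[(y - firstRow) * width + x] = shade(x, y);
                    }
                }
                // One bulk copy into this worker's own slice of the shared array.
                System.arraycopy(band, 0, image, firstRow * width, band.length);
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // join() also makes the worker's writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while rendering", e);
            }
        }
        return image;
    }
}
```

Calling BandedRender.render(width, height, n) with different values of n should give identical pixels; only the timing changes.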

The best piece of advice I can give you is test, test, test. Don't make assumptions about how CPUs will perform - there is a huge amount of magic going on in CPUs these days, often with counterintuitive results. Note also that the JIT runtime optimization will add an additional layer of complexity here (maybe good, maybe not).

On the one hand, you'd like to think Threads == CPU/Cores makes perfect sense. Why have a thread if there's nothing to run it?

The detail boils down to "what are the threads doing". A thread that's idle waiting for a network packet or a disk block is CPU time wasted.

If your threads are CPU heavy, then a 1:1 correlation makes some sense. If you have a single "read the DB" thread that feeds the other threads, and a single "dump the data" thread that pulls data from the CPU threads and creates output, those two could most likely share a CPU easily while the CPU heavy threads keep churning away.
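A rough sketch of that layout, assuming bounded queues between the stages. The class name Pipeline, the STOP marker and the squaring step are placeholders for illustration; a real program would read rows from the database and write image data instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class Pipeline {
    private static final int STOP = -1; // hypothetical end-of-stream marker

    // Sums the squares of 0..items-1 using one reader, `cpuThreads` workers and one writer.
    static long run(int items, int cpuThreads) {
        if (cpuThreads < 1) throw new IllegalArgumentException("need at least one CPU thread");
        BlockingQueue<Integer> input = new ArrayBlockingQueue<>(256);
        BlockingQueue<Long> output = new ArrayBlockingQueue<>(256);
        long[] total = new long[1];
        List<Thread> threads = new ArrayList<>();

        // "Read the DB" thread: in a real program it would mostly wait on I/O.
        threads.add(new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) input.put(i);
                for (int w = 0; w < cpuThreads; w++) input.put(STOP); // one marker per worker
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));

        // CPU heavy threads: take an item, do the math, hand the result on.
        for (int w = 0; w < cpuThreads; w++) {
            threads.add(new Thread(() -> {
                try {
                    for (int item = input.take(); item != STOP; item = input.take()) {
                        output.put((long) item * item); // stand-in for the real work
                    }
                    output.put((long) STOP); // tell the writer this worker is done
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }

        // "Dump the data" thread: collects results until every worker has finished.
        threads.add(new Thread(() -> {
            try {
                int finished = 0;
                while (finished < cpuThreads) {
                    long value = output.take();
                    if (value == STOP) finished++; else total[0] += value;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));

        for (Thread t : threads) t.start();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return total[0];
    }
}
```

Here only cpuThreads is sized to the core count; the reader and writer are extra threads that spend most of their time blocked.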

The real answer, as with all sorts of things, is to measure it. Since the number is configurable (apparently), configure it! Run it with 1:1 threads to CPUs, 2:1, 1.5:1, whatever, and time the results. Fastest one wins.
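Something along these lines could do the timing, as a sketch only. ThreadCountTrial and its work method are made up for illustration, and the loop inside work stands in for whatever the real per-row computation is. A proper benchmark would repeat each run several times and probably use a harness such as JMH.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadCountTrial {
    // Hypothetical CPU-bound stand-in for rendering one row.
    static double work(int row) {
        double acc = 0;
        for (int i = 1; i <= 20_000; i++) acc += Math.sqrt(row + i);
        return acc;
    }

    // Runs `rows` tasks on a pool of `threads` threads and returns elapsed nanoseconds.
    static long timeRun(int threads, int rows) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<Double>> results = new ArrayList<>();
            for (int r = 0; r < rows; r++) {
                final int row = r;
                Callable<Double> task = () -> work(row);
                results.add(pool.submit(task));
            }
            for (Future<Double> f : results) f.get(); // wait for every task
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while timing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int[] candidates = {cores, cores * 3 / 2, cores * 2}; // 1:1, 1.5:1, 2:1
        for (int threads : candidates) {
            timeRun(threads, 200); // warm-up pass so the JIT has had a chance to compile work()
            long nanos = timeRun(threads, 2_000);
            System.out.printf("%d threads: %d ms%n", threads, nanos / 1_000_000);
        }
    }
}
```

The numbers will differ from machine to machine, which is the point of running it on the hardware you care about.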

The number that your application needs; no more, and no less.

Obviously, if you're writing an application which contains some parallelisable algorithm, then you can probably start benchmarking to find a good balance in the number of threads, but bear in mind that hundreds of threads won't speed up any operation.

If your algorithm can't be parallelised, then no number of additional threads is going to help.

Yes, that's a perfectly reasonable approach. One thread per processor/core will maximize processing power and minimize context switching. I'd probably leave that as-is unless I found a problem via benchmarking/profiling.

One thing to note is that the JVM does not guarantee availableProcessors() will be constant, so technically, you should check it immediately before spawning your threads. I doubt that this value is likely to change at runtime on typical computers, though.
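As a sketch of that, the count can be read when the pool is created instead of being cached in a static final field. RenderPool and renderRows are hypothetical names, and the counter increment is a placeholder for the real per-row work. It also submits one task per row rather than one thread per pixel.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class RenderPool {
    // Reads the processor count at call time rather than once at class load.
    static int currentThreadCount() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    // Creates a fixed pool sized to the processor count as seen right now.
    static ExecutorService newPool() {
        return Executors.newFixedThreadPool(currentThreadCount());
    }

    // Submits one task per row and waits for all of them; returns how many rows ran.
    static int renderRows(int rows) {
        ExecutorService pool = newPool();
        AtomicInteger done = new AtomicInteger();
        for (int r = 0; r < rows; r++) {
            pool.execute(done::incrementAndGet); // placeholder for the real row rendering
        }
        pool.shutdown(); // no new tasks; already queued tasks still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("rendering timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while rendering", e);
        }
        return done.get();
    }
}
```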

P.S. As others have pointed out, if your process is not CPU-bound, this approach is unlikely to be optimal. Since you say these threads are being used to generate images, though, I assume you are CPU-bound.

The number of processors is a good start, but if those threads do a lot of I/O, then it might be better with more... or fewer.

First think about what resources are available and what you want to optimise (least time to finish, least impact on other tasks, etc.). Then do the math.

Sometimes it can be better to dedicate a thread or two to each I/O resource and let the others fight for CPU. The analysis is usually easier with these designs.

The benefit of using threads is to reduce wall-clock execution time of your program by allowing your program to work on a different part of the job while another part is waiting for something to happen (usually I/O). If your program is totally CPU-bound, adding threads beyond the number of processors will only slow it down. If it is fully or partially I/O-bound, adding threads may help, but there's a balance point to be struck between the overhead of adding threads and the additional work that will get accomplished. Making the number of threads equal to the number of processors will yield peak performance if the program is totally, or near-totally, CPU-bound.

As with many questions with the word "should" in them, the answer is, "It depends". If you think you can get better performance, adjust the number of threads up or down and benchmark the application's performance. Also take into account any other factors that might influence the decision (if your application is eating 100% of the computer's available horsepower, the performance of other applications will be reduced).

This assumes that the multi-threaded code is written properly, etc. If the original developer only had one CPU, he would never have had a chance to experience problems with poorly-written threading code. So you should probably test behaviour as well as performance when adjusting the number of threads.

By the way, you might want to consider allowing the number of threads to be configured at run time instead of compile time to make this whole process easier.
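One way to do that, as a sketch: read the count from a system property and fall back to the processor count. The property name render.threads and the class ThreadConfig are invented for this example.

```java
final class ThreadConfig {
    // Hypothetical property name; set it with -Drender.threads=8 on the command line.
    static final String PROPERTY = "render.threads";

    // Returns the parsed positive integer, or `fallback` if the value is missing or invalid.
    static int parse(String value, int fallback) {
        if (value == null) return fallback;
        try {
            int n = Integer.parseInt(value.trim());
            return n > 0 ? n : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Thread count from the system property, defaulting to the available processors.
    static int threadCount() {
        return parse(System.getProperty(PROPERTY), Runtime.getRuntime().availableProcessors());
    }
}
```

The existing constant could then be replaced by a call to ThreadConfig.threadCount(), so trying 1:1, 1.5:1 or 2:1 only needs a different -D flag and no recompile.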

After seeing your edit, it's quite possible that one thread per CPU is as good as it gets. Your application seems quite parallelizable. If you have extra hardware you can use GridGain to grid-enable your app and have it run on multiple machines. That's probably about the only thing, beyond buying faster / more cores, that will speed it up.
