When should a task be considered “long running”?

When working with tasks, a rule of thumb appears to be that the thread pool - typically used by, e.g., invoking Task.Run() or Parallel.Invoke() - should be used for relatively short operations. When working with long-running operations, we are supposed to use the TaskCreationOptions.LongRunning flag in order to - as far as I understand it - avoid clogging the thread pool queue, i.e. to push work to a newly created thread.

But what exactly is a long-running operation? How long is long, in terms of time? Are there other factors besides the expected task duration to consider when deciding whether or not to use LongRunning, such as the anticipated CPU architecture (frequency, number of cores, ...) or the number of tasks the programmer will attempt to run at once?

For example, suppose I have 500 tasks to process in a dedicated application, each taking 10-20 seconds to complete. Should I just start all 500 tasks using Task.Run (e.g. in a loop) and then await them all, perhaps as LongRunning, while leaving the default maximum level of concurrency? Then again, if I set LongRunning in such a case, wouldn't this create 500 new threads and actually cause a lot of overhead and higher memory usage (due to the extra threads being allocated) compared to omitting LongRunning? This is assuming that no new tasks will be scheduled for execution while these 500 are being awaited.

I would guess that the decision to set LongRunning depends on the number of requests made to the thread pool in a given time interval, and that LongRunning should only be used for tasks that are expected to take significantly longer than the majority of the tasks placed on the thread pool - by definition, at most a small percentage of all tasks. In other words, this appears to be a queuing and thread pool utilization optimization problem that should likely be solved case by case through testing, if at all. Am I correct?

It kind of doesn't matter. The problem isn't really about time, it's about what your code is doing. If you're doing asynchronous I/O, you're only using the thread for the short amount of time between individual requests. If you're doing CPU work... well, you're using the CPU. There's no "thread-pool starvation", because the CPUs are fully utilized.

The real problem is when you're doing blocking work that doesn't use the CPU. In a case like that, thread-pool starvation leads to CPU underutilization - you said "I need the CPU for my work" and then you don't actually use it.

If you're not using blocking APIs, there's no point in using Task.Run with LongRunning. If you have to run some legacy blocking code asynchronously, using LongRunning may be a good idea. Total work time isn't as important as how often you are doing this. If you spin up one thread in response to a user clicking on a GUI, the cost is tiny compared to all the latencies already involved in the act of clicking a button in the first place, and you can use LongRunning just fine to avoid the thread pool. If you're running a loop that spawns lots of blocking tasks... stop doing that. It's a bad idea :D

For example, imagine there is no asynchronous API alternative to File.Exists. So if you see that this is giving you trouble (e.g. over a faulty network connection), you'd fire it up using Task.Run - and since you're not doing CPU work, you'd use LongRunning.
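A minimal sketch of that File.Exists case, assuming .NET 6 or later (top-level statements). FileExistsAsync is a hypothetical helper name, not a framework API. Since Task.Run has no overload that accepts TaskCreationOptions, the sketch goes through Task.Factory.StartNew and passes the scheduler explicitly:

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: moves the blocking File.Exists call off the caller's thread.
// LongRunning is only a hint to the scheduler that this work should not occupy a pool thread.
static Task<bool> FileExistsAsync(string path) =>
    Task.Factory.StartNew(
        () => File.Exists(path),          // blocking I/O, almost no CPU work
        CancellationToken.None,
        TaskCreationOptions.LongRunning,
        TaskScheduler.Default);           // explicit, so TaskScheduler.Current is never used

// Usage: probe a path that almost certainly does not exist.
string probe = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
bool exists = await FileExistsAsync(probe);
Console.WriteLine($"{probe} exists: {exists}");
```

This only pays off when the call is occasional. Running it in a loop over many paths would create one thread per call, which is the pattern the answer warns against.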

In contrast, if you need to do some image manipulation that's basically 100% CPU work, it doesn't matter how long the operation takes - it's not a LongRunning thing.

And finally, the most common scenario for using LongRunning is when your "work" is actually the old-school "loop and periodically check if something should be done, do it and then loop again". Long running, but 99% of the time just blocking on some wait handle or something like that. Again, this is only useful when dealing with code that isn't CPU-bound, but that doesn't have proper asynchronous APIs. You might find something like this if you ever need to write your own SynchronizationContext, for example.
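The "loop and block" pattern can be sketched roughly as follows, assuming .NET 6 or later. The queue, the string work items and the upper-casing step are all made up for illustration; the point is that the worker spends nearly all of its life blocked inside GetConsumingEnumerable rather than using the CPU:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var queue = new BlockingCollection<string>();
var handled = new ConcurrentQueue<string>();

// The loop blocks while the queue is empty and wakes up only when an item arrives.
Task worker = Task.Factory.StartNew(
    () =>
    {
        foreach (string item in queue.GetConsumingEnumerable())
        {
            handled.Enqueue(item.ToUpperInvariant());   // stand-in for the real work
        }
    },
    CancellationToken.None,
    TaskCreationOptions.LongRunning,   // one worker that lives for a long time
    TaskScheduler.Default);

queue.Add("first");
queue.Add("second");
queue.CompleteAdding();   // ends the enumeration so the worker can finish
await worker;

Console.WriteLine(string.Join(", ", handled));
```

If the producer side can be made asynchronous (for example with System.Threading.Channels), the dedicated thread is no longer needed at all.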

Now, how do we apply this to your example? Well, we can't, not without more information. If your code is CPU-bound, Parallel.For and friends are what you want - those ensure you only use enough threads to saturate the CPUs, and it's fine to use the thread pool for that. If it's not CPU-bound... you don't really have any option besides using LongRunning if you want to run the tasks in parallel. Ideally, such work would consist of asynchronous calls that you can safely invoke and then await Task.WhenAll(...) from your own thread.
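For the ideal case, where the work is made of truly asynchronous calls, a rough sketch could look like this (assuming .NET 6 or later). FetchLengthAsync is a hypothetical stand-in for a real asynchronous operation, with Task.Delay simulating the I/O wait:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a real asynchronous call (HTTP request, database query, ...).
static async Task<int> FetchLengthAsync(int id)
{
    await Task.Delay(50);   // no thread is held while the operation is pending
    return id * 2;
}

// Start all 500 operations, then wait for them together.
Task<int>[] pending = Enumerable.Range(0, 500)
    .Select(id => FetchLengthAsync(id))
    .ToArray();

int[] results = await Task.WhenAll(pending);
Console.WriteLine($"{results.Length} operations finished, sum = {results.Sum()}");
```

No LongRunning flag and no extra threads are involved here: each operation occupies a thread only for the brief moments before and after its await.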

"When working with tasks, a rule of thumb appears to be that the thread pool - typically used by, e.g., invoking Task.Run() or Parallel.Invoke() - should be used for relatively short operations. When working with long-running operations, we are supposed to set the TaskCreationOptions.LongRunning to true in order to - as far as I understand it - avoid clogging the thread pool queue, i.e. to push work to a newly created thread."

The vast majority of the time, you don't need to use LongRunning at all, because the thread pool will adjust to "losing" a thread to a long-running operation after 2 seconds.

The main problem with LongRunning is that it forces you to use the very dangerous StartNew API.
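The answer does not spell out the danger here, so the following is only a sketch of two commonly cited pitfalls, assuming .NET 6 or later: StartNew uses TaskScheduler.Current unless a scheduler is passed, and it does not unwrap asynchronous delegates. The delegate body and the value 42 are made up for illustration:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Pitfall: with an async delegate, StartNew returns a Task<Task<int>>.
// The outer task completes as soon as the delegate reaches its first await.
Task<Task<int>> outer = Task.Factory.StartNew(
    async () =>
    {
        await Task.Delay(100);
        return 42;
    },
    CancellationToken.None,
    TaskCreationOptions.LongRunning,   // the dedicated thread is released at the first await
    TaskScheduler.Default);            // always pass a scheduler explicitly

// Unwrap() produces a task that represents the whole asynchronous operation.
int value = await outer.Unwrap();
Console.WriteLine(value);
```

Task.Run handles both issues by itself (it always uses the default scheduler and unwraps), which is one reason to prefer it whenever LongRunning is not strictly required.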

"In other words, this appears to be a queuing and thread pool utilization optimization problem that should likely be solved case by case through testing, if at all. Am I correct?"

Yes. You should never set LongRunning when first writing code. If you are seeing delays due to the thread pool injection rate, then you can carefully add LongRunning.

You should not use TaskCreationOptions.LongRunning in your case. I would use Parallel.For.
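A minimal sketch of the Parallel.For suggestion, assuming the 500 work items are CPU-bound and assuming .NET 6 or later. SumOfSquares is a made-up placeholder for the real per-item work:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical CPU-bound work item.
static long SumOfSquares(int n)
{
    long total = 0;
    for (int i = 1; i <= n; i++)
    {
        total += (long)i * i;
    }
    return total;
}

var outputs = new long[500];

// Parallel.For partitions the 500 iterations across thread-pool threads
// and blocks until all of them have finished.
Parallel.For(0, outputs.Length, i => outputs[i] = SumOfSquares(i));

Console.WriteLine($"outputs[10] = {outputs[10]}");
```

Each iteration writes to its own array slot, so no locking is needed. If the default degree of parallelism turns out to be too aggressive, the overload that takes a ParallelOptions with MaxDegreeOfParallelism can cap it.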

The LongRunning option is not meant to be used if you're going to create a lot of tasks, as in your case. It is meant for creating a couple of tasks that will be running for a long time.

By the way, I have never used this option in any similar scenario.

As you point out, TaskCreationOptions.LongRunning's purpose is

"to allow the ThreadPool to continue to process work items even though one task is running for an extended period of time"

As for when to use it:

"It's not a specific length per se... You'd typically only use LongRunning if you found through performance testing that not using it was causing long delays in the processing of other work."

Source
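One way to observe that behaviour is to check Thread.IsThreadPoolThread from inside the task. This is a sketch assuming .NET 6 or later and the default task scheduler; the flag is documented only as a hint, so the dedicated thread is what current desktop and server runtimes do in practice rather than a contractual guarantee:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Task.Run queues the delegate to the thread pool.
bool onPoolWithRun = await Task.Run(() => Thread.CurrentThread.IsThreadPoolThread);

// With LongRunning, the default scheduler typically starts a dedicated thread instead.
bool onPoolWithLongRunning = await Task.Factory.StartNew(
    () => Thread.CurrentThread.IsThreadPoolThread,
    CancellationToken.None,
    TaskCreationOptions.LongRunning,
    TaskScheduler.Default);

Console.WriteLine($"Task.Run on pool thread: {onPoolWithRun}");
Console.WriteLine($"LongRunning on pool thread: {onPoolWithLongRunning}");
```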
