
C#: Processing a bunch of data with multiple instances of the same class

We have a web application running in the Azure cloud as a C#/.NET worker role. Part of this application deciphers a lot of short strings in different objects (about 2000 per request). We want to make this as fast as possible, and we need to manage multithreading correctly.

Our original implementation had each object create a new instance of the class and a new thread to decrypt its string, but that took too much time.

If we construct only one instance of the class and run all the data through it, instead of constructing it over and over for each object, it is much faster.

The question is how to improve on that. We want to do something like this, but have no idea how:

  1. Create a pool of data to be decrypted.
  2. Create multiple instances of the same class (ideally one per CPU core), each running on its own thread.
  3. Have some sort of mechanism to feed those instances with data.
  4. When the pool is empty, close all threads.

We don't want to start a new thread for every object in the pool. Instead, we want a limited number of threads running in parallel, feeding on the same list of data and processing it one item at a time.
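The four steps above can be sketched roughly as follows. This is only a minimal illustration, not the code from the application: the names `DecryptPool` and `DecryptAll` are hypothetical, and it assumes the strings are UTF-8 text encrypted with AES in the .NET default mode (CBC with PKCS7 padding) under one shared key and IV. A `BlockingCollection<int>` acts as the pool, and each worker constructs its `Aes` object once and keeps pulling items until the pool is empty.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

public static class DecryptPool
{
    // Hypothetical helper: decrypts all items with one worker per CPU core.
    // Assumes AES (CBC/PKCS7 defaults) with a shared key and IV, and UTF-8 plaintext.
    public static string[] DecryptAll(byte[][] cipherTexts, byte[] key, byte[] iv)
    {
        var results = new string[cipherTexts.Length];

        // Step 1: the pool is a thread-safe queue of indexes into cipherTexts.
        using (var pool = new BlockingCollection<int>())
        {
            for (int i = 0; i < cipherTexts.Length; i++) pool.Add(i);
            pool.CompleteAdding(); // signals that no more work will arrive

            // Step 2: one long-running worker per core, each with its own Aes instance.
            var workers = Enumerable.Range(0, Environment.ProcessorCount)
                .Select(_ => Task.Factory.StartNew(() =>
                {
                    using (var aes = Aes.Create()) // constructed once per worker
                    {
                        aes.Key = key;
                        aes.IV = iv;

                        // Step 3: workers feed themselves from the shared pool.
                        foreach (int index in pool.GetConsumingEnumerable())
                        {
                            // A fresh transform per item is the conservative choice.
                            using (var decryptor = aes.CreateDecryptor())
                            {
                                byte[] data = cipherTexts[index];
                                byte[] plain = decryptor.TransformFinalBlock(data, 0, data.Length);
                                results[index] = Encoding.UTF8.GetString(plain);
                            }
                        }
                    } // Step 4: the loop ends and the worker exits once the pool is empty.
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            Task.WaitAll(workers);
        }
        return results;
    }
}
```

Because every worker writes only to its own slots in `results`, no locking is needed around the output array. When the whole input is already in memory, `Parallel.ForEach` with a thread-local `Aes` instance would be a shorter alternative with similar behaviour.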


UPDATE 1:

We tried the approaches mentioned in the comments, especially Storage Queue and Web Job, but due to the structure of our code, implementing them would have required significant changes with an uncertain result, so that was not the way to go.

In the end we did the following; the results are shared at the end:

We create up to 12 "decryptor" instances, each deciphering with AES-256. The number 12 is only the upper limit; in practice only 4 to 6 instances are created, depending on load. The instances are closed when the main queue is depleted.

All objects that have to be deciphered are in the main queue, and every "decryptor" instance has its own imaginary queue. We take objects from the main queue one by one and hand each to a "decryptor" with 0 objects in its imaginary queue or, failing that, to the one with the lowest count of objects.
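The dispatch scheme described above could look roughly like the sketch below. It is an illustration under assumptions, not the actual implementation: `QueuedWorker`, `LeastLoadedDispatcher`, and the `createProcessor` factory are hypothetical names, and the rule for when to start another instance (only when every existing one is busy and the cap is not reached) is a guess at what "created based on load" means.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical worker: one long-lived processor plus its own queue and pending counter.
public sealed class QueuedWorker<TIn, TOut>
{
    private readonly BlockingCollection<(int Index, TIn Item)> _queue =
        new BlockingCollection<(int Index, TIn Item)>();
    private readonly Task _loop;
    private int _pending;

    public QueuedWorker(Func<TIn, TOut> process, TOut[] results)
    {
        _loop = Task.Factory.StartNew(() =>
        {
            foreach (var (index, item) in _queue.GetConsumingEnumerable())
            {
                results[index] = process(item);
                Interlocked.Decrement(ref _pending);
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Number of items queued or in progress for this worker.
    public int Pending => Volatile.Read(ref _pending);

    public void Enqueue(int index, TIn item)
    {
        Interlocked.Increment(ref _pending);
        _queue.Add((index, item));
    }

    // Called once the main queue is depleted: drain, stop, and clean up.
    public void Close()
    {
        _queue.CompleteAdding();
        _loop.Wait();
        _queue.Dispose();
    }
}

public static class LeastLoadedDispatcher
{
    // createProcessor builds one processor (e.g. one decryptor) per worker.
    public static TOut[] Run<TIn, TOut>(
        IReadOnlyList<TIn> mainQueue, Func<Func<TIn, TOut>> createProcessor, int maxWorkers)
    {
        var results = new TOut[mainQueue.Count];
        var workers = new List<QueuedWorker<TIn, TOut>>();

        for (int i = 0; i < mainQueue.Count; i++)
        {
            // Pick the worker with the fewest pending items (an idle one has 0).
            var target = workers.OrderBy(w => w.Pending).FirstOrDefault();

            // Assumed scaling rule: add a worker only if all are busy and the cap allows.
            if (target == null || (target.Pending > 0 && workers.Count < maxWorkers))
            {
                target = new QueuedWorker<TIn, TOut>(createProcessor(), results);
                workers.Add(target);
            }
            target.Enqueue(i, mainQueue[i]);
        }

        foreach (var worker in workers) worker.Close();
        return results;
    }
}
```

A call such as `LeastLoadedDispatcher.Run<byte[], string>(items, () => MakeDecryptor(), 12)` would mirror the cap of 12 mentioned above, where `MakeDecryptor` stands in for whatever constructs one decrypting instance.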

Results

The "get all" method took the most time, so we use it as our reference:

  1. Original implementation: 6.39 seconds / CPU load 100% on 16 cores
  2. Implementation with one "decryptor" instance: 1.62 seconds / CPU load 50 - 60% on 16 cores
  3. 12 "decryptor" instances: 1.27 seconds / CPU load 20 - 25% on 16 cores

As you can see, we were able to decrease the time by about 21% compared to the single-instance implementation. More importantly, we reduced CPU usage, so we will try to reduce the number of cores without compromising speed.

The next step will be larger performance tests to see what the limits of this approach are.

The answer is in Update 1 above.
