
Using TPL in .NET

I have to refactor a fairly time-consuming process in one of my applications, and after doing some research I think it's a perfect match for the TPL. I want to clarify my understanding of it and ask whether there are any other issues I should take into account.

In short, I have a Windows service that runs overnight and sends out emails with data updates to around 10000 users. At present, the whole process takes around 8 hrs to complete. I would like to reduce it to 2 hrs max.

The application workflow follows the steps below:

1. Iterate through the list of all users
2. Check whether this user has to be notified
3. If so, create an email body by calling an external service
4. Send the email

Analysis of the code has shown that step 3 is the most time-consuming one and takes around 3.5 sec to complete. This means that when processing 10000 users, my application waits well over 6 hrs in total for a response from the external service! I think this is reason enough to try to introduce some asynchronous and parallel processing.

So, my plan is to use the Parallel class and its ForEach method to iterate through the users in step 1. As I understand it, this should distribute the processing of each user onto a separate thread, making them run in parallel? The per-user processes are completely independent of each other and none of them returns a value. If any exception is thrown, it will be persisted in the logs db. As for step 3, I would like to convert the call to the external service into an async call. As I understand it, this would release the thread so that the Parallel class could reuse it to start processing the next user from the list?

I read through the MS documentation on the TPL, especially the "Potential Pitfalls in Data and Task Parallelism" document, and the only point I'm not sure about is "Avoid Writing to Shared Memory Locations". I am using a local integer to count the total number of emails processed. As for all the rest, I'm fairly sure they don't apply to my scenario.

My question, without any implementation as yet: is what I'm trying to achieve possible (especially the async/await part for the external service call)? Should I be aware of any other obstacles that might affect my implementation? Is there a better way of improving the workflow?

Just to clarify, I'm using .NET v4.0.

Yes, you can use the TPL for your problem. If you cannot influence your external source, then this might be the best way.

However, you will make the biggest gains if you can get your external source to accept batches, because the source itself could then optimize the performance. Right now you have the overhead of 10000 messages to serialize, send, work on, receive and deserialize. This is work that could be done once. In addition, your external source might be able to optimize the work it does if it knows it will get multiple records.

So the bottom line is: if you need to optimize locally, the TPL is fine. If you want to optimize your whole process for actual gains, try to find out whether your external source can help you, because that is where you can make some real progress.
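For the local option, a minimal sketch of what the loop could look like on .NET 4.0 follows. It is only an illustration under stated assumptions: `NotificationRun`, `Run` and the four delegates (`mustNotify`, `createBody`, `sendEmail`, `logError`) are hypothetical names standing in for the question's steps 2 to 4 and its error logging. The `async`/`await` keywords only arrived with C# 5 and .NET 4.5, so the sketch keeps the external call blocking and caps concurrency with `ParallelOptions.MaxDegreeOfParallelism`. The shared counter is updated with `Interlocked.Increment`, which addresses the "Avoid Writing to Shared Memory Locations" concern for that one integer.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class NotificationRun
{
    // Processes every user with at most maxParallel concurrent workers and
    // returns the number of e-mails sent. All four delegates are hypothetical
    // stand-ins for the question's steps 2-4 and its error logging.
    public static int Run(
        IEnumerable<string> users,
        int maxParallel,
        Func<string, bool> mustNotify,
        Func<string, string> createBody,    // blocking call to the external service
        Action<string, string> sendEmail,
        Action<string, Exception> logError)
    {
        int sent = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(users, options, user =>
        {
            try
            {
                if (!mustNotify(user)) return;    // step 2
                string body = createBody(user);   // step 3
                sendEmail(user, body);            // step 4
                Interlocked.Increment(ref sent);  // atomic update of the shared counter
            }
            catch (Exception ex)
            {
                // Handle failures per user so one bad record does not abort the loop.
                logError(user, ex);
            }
        });

        return sent;
    }
}
```

A call such as `NotificationRun.Run(users, 8, ...)` would then process up to eight users at a time. The value 8 is arbitrary here; a sensible cap depends on how much concurrent load the external service and the mail server tolerate.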

You didn't show any code, and I'm assuming that step 4 (send an e-mail) is not that fast either.

In the case presented, unless your external service from step 3 (create an email body by calling an external service) processes requests in parallel and supports a good load of simultaneous requests, you will not gain much from this refactor.

In other words, first test the external service and the e-mail server for:

  • Parallel request execution

    The way to test this is to send at least 2 simultaneous requests and observe how long it takes to process them.

    If it takes about double the time of a single request, the requests are being processed at least partly serially: either they're queued or some broad lock is being taken.

  • Load test

    Go up to 4, 8, 12, 16, 20, etc., and see where it starts to degrade.

    You should limit the number of simultaneous requests to something that keeps each request running at no less than, e.g., 80% of the speed of a single isolated request, assuming you're the sole consumer.

    Or stop a few requests below the point where it starts degrading (e.g. divide by the number of consumers), to leave the external service available for other consumers.
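The two tests above can be sketched as a small probe. This is only an illustration: `ConcurrencyProbe` and `Measure` are hypothetical names, and `request` stands in for one blocking call to the external service or the mail server. The tasks are started with `TaskCreationOptions.LongRunning` as a hint to the scheduler, so that thread-pool ramp-up is less likely to distort the timings.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrencyProbe
{
    // Fires `concurrency` requests at once and returns the wall-clock time
    // until the last one completes. `request` is a hypothetical stand-in for
    // one blocking call to the service under test.
    public static TimeSpan Measure(int concurrency, Action request)
    {
        var watch = Stopwatch.StartNew();
        Task[] tasks = Enumerable.Range(0, concurrency)
            .Select(_ => Task.Factory.StartNew(request, TaskCreationOptions.LongRunning))
            .ToArray();
        Task.WaitAll(tasks);  // rethrows failures as an AggregateException
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling `Measure(1, ...)` gives the baseline. If `Measure(2, ...)` comes out at roughly double that, the service is handling the requests serially. Repeating the call with 4, 8, 12, 16, 20 and comparing each result against the baseline shows where the degradation starts.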

Only then can you decide whether the refactor is worth it. If you can't change the external service or the e-mail server, you must weigh whether they offer enough parallel capability without degrading.

Even so, be realistic. Don't let your service push the external service and the e-mail server to their limits in production.
