
Chained filters with parallelization in Java

I am trying to parse an input into an output by applying a sequence of transformations t0, t1, ..., tn. The execution will be a chain:

input -> t0 -> t1 -> ... -> tn -> output

Some of the transformations should be parallelized so that they do not become a bottleneck.

Is there any framework for creating such a process chain in Java? I know how I could do it manually (e.g. "Queuing jobs in a processing chain in Java"), but I am specifically looking for a framework, since

  • A longer, (partially) multithreaded chain with caches, waits, ... can quickly become complex
  • I have several such chains and want to avoid reinventing what might be covered by a standard library

I believe the fork/join framework (in JDK 7) is what you are looking for. It enables specifying task dependencies and splitting jobs into more parallel chunks depending on available resources.
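As a rough illustration of this suggestion, the sketch below runs a two-step chain in which each step is split into parallel chunks by the fork/join pool. The class name `ForkJoinStage`, the threshold, and the two transformations (`x + 1`, then `x * 2`) are made up for the example; the lambdas and `ForkJoinPool.commonPool()` also assume Java 8 or later rather than JDK 7.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.function.IntUnaryOperator;

// Hypothetical example class; not part of any framework mentioned in this thread.
class ForkJoinStage {
    // Chunks at or below this size are processed sequentially (arbitrary value).
    static final int THRESHOLD = 1_000;

    // Applies one transformation in place to data[from, to), splitting recursively.
    static class TransformTask extends RecursiveAction {
        private final int[] data;
        private final int from;
        private final int to;
        private final IntUnaryOperator transform;

        TransformTask(int[] data, int from, int to, IntUnaryOperator transform) {
            this.data = data;
            this.from = from;
            this.to = to;
            this.transform = transform;
        }

        @Override
        protected void compute() {
            if (to - from <= THRESHOLD) {
                for (int i = from; i < to; i++) {
                    data[i] = transform.applyAsInt(data[i]);
                }
                return;
            }
            int mid = (from + to) >>> 1;
            // Fork both halves and wait for both to finish.
            invokeAll(new TransformTask(data, from, mid, transform),
                      new TransformTask(data, mid, to, transform));
        }
    }

    // Runs the chain t0 -> t1 on a copy of the input.
    static int[] run(int[] input) {
        int[] data = input.clone();
        ForkJoinPool pool = ForkJoinPool.commonPool();
        pool.invoke(new TransformTask(data, 0, data.length, x -> x + 1)); // t0
        pool.invoke(new TransformTask(data, 0, data.length, x -> x * 2)); // t1
        return data;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(new int[] {1, 2, 3}))); // prints [4, 6, 8]
    }
}
```

Each stage here finishes completely before the next one starts, so this parallelizes within a stage rather than overlapping stages.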

There are many different execution models for expressing such a transformation chain. Transformations can be threads with loops reading from input queues, or objects with a method that handles each incoming message (with threads and loops hidden under the hood). A transformation may have a single input or several inputs. It may know its successors and send them results directly, or it may just return a value from a method while a separately described topology takes care of further message routing. The availability of space to hold results may or may not be taken into account when a transformation step is started, and so on. A transformation step may be very short, in which case the message-passing overhead is significant and message passing should be carefully optimized (see Disruptor); otherwise an ordinary linked queue that creates a wrapper object for each message would suffice.
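To make the first model concrete, here is a minimal sketch of stages as threads looping over input queues. Everything in it (the `QueuePipeline` class, the `POISON` end marker, the queue capacity of 16, the two string transformations) is an assumption for illustration, not taken from any framework named here. The bounded queues mean that a fast stage blocks when its successor has no space left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

// Hypothetical example class: each stage is a thread looping over its input queue.
class QueuePipeline {
    // End-of-stream marker, compared by identity so real messages can never match it.
    static final String POISON = new String("POISON");

    // Starts a thread that takes from 'in', applies 'fn', and puts the result on 'out'.
    static Thread stage(BlockingQueue<String> in, BlockingQueue<String> out,
                        Function<String, String> fn) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    String msg = in.take();
                    if (msg == POISON) {
                        out.put(POISON); // pass the marker on, then stop
                        return;
                    }
                    out.put(fn.apply(msg));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    // Wires input -> t0 -> t1 -> output and collects the results in order.
    static List<String> run(List<String> input) {
        BlockingQueue<String> q0 = new ArrayBlockingQueue<>(16);
        BlockingQueue<String> q1 = new ArrayBlockingQueue<>(16);
        BlockingQueue<String> q2 = new ArrayBlockingQueue<>(16);
        stage(q0, q1, String::trim);        // t0
        stage(q1, q2, String::toUpperCase); // t1

        // Feed from a separate thread so this thread is free to drain the output.
        Thread feeder = new Thread(() -> {
            try {
                for (String s : input) {
                    q0.put(s);
                }
                q0.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        feeder.start();

        List<String> result = new ArrayList<>();
        try {
            for (String s = q2.take(); s != POISON; s = q2.take()) {
                result.add(s);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining output", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(" a ", "b "))); // prints [A, B]
    }
}
```

With one thread per stage the message order is preserved; running several threads for a slow stage would need extra work to restore ordering.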

That said, it is hard to give advice without knowing your requirements. You should explore which models and implementations exist and find the one most appropriate for your case. Without knowing the details, advice like "I believe framework XYZ is what you are looking for" only describes the advisor as a person who hardly knows any framework besides XYZ. Nevertheless, I dare to recommend that you look at Dataflow Framework for Java, whose model is rich enough, though it has its limitations (e.g. execution nodes may not block).

As Alexei said about advice, let the buyer beware. I just happen to maintain a Fork/Join framework on SourceForge that supports filters. Since the code is free and open-source, you can use it as is or do what you want with it: TymeacDSE
