Apache Camel Splitter, Threadpool and JMS

I have defined the following route in Spring XML to split the rows of a text file and send each row to a JMS queue:

<bean id="myPool" class="java.util.concurrent.Executors" factory-method="newCachedThreadPool"/>

<camelContext id="concurrent-route-context" xmlns="http://camel.apache.org/schema/spring" trace="true">
    <route id="inbox-threadpool-split-route">
        <from uri="{{inbox.uri}}" />
        <log message="Starting to process file: ${header.CamelFileName}" />
        <split streaming="true" executorServiceRef="myPool">
            <tokenize token="\n" />
            <to uri="{{inventory.queue.uri}}" />
        </split>
        <log message="Done processing file: ${header.CamelFileName}" />
    </route>
</camelContext>

inbox.uri is a file component URI listening for files in a directory, while inventory.queue.uri is a JmsComponent URI connecting to a queue on a JMS server (Tibco EMS 6.X). The JmsComponent URI is simple, like "JmsComponent:queue:?username=&password=" (queue name and credentials omitted here).
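For illustration, the two property placeholders might resolve to values along these lines (the inbox path is inferred from the log further below; the queue name and credentials are hypothetical, since they are omitted in the question):

inbox.uri=file://target/inbox
inventory.queue.uri=JmsComponent:queue:inventory?username=myUser&password=myPass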

The above route runs without error, but the rows split from the file are never sent to the queue as JMS messages (i.e. the queue is still empty after the program has run).

If I remove executorServiceRef="myPool" from the splitter definition (leaving <split streaming="true">), the split messages are delivered to the JMS queue one by one.

If I replace the "to" URI with a "direct" endpoint, the split messages are delivered whether or not the thread pool is used in the splitter.

Is there any special setting required on the JmsComponent to make it work with the Splitter + thread pool? Or is there any other configuration I have missed?

======= Edit on 20150731 =======

I ran into the above issue while testing with a big CSV file of 1000 rows. If I test with a small file (e.g. only 10 rows), I can see that the messages are delivered to inventory.queue, but from the log it seems to take 10 seconds to complete the splitting and deliver the messages to the queue. The captured log is below:

2015-07-31 11:02:07,210 [main           ] INFO  SpringCamelContext             - Apache Camel 2.15.0 (CamelContext: concurrent-route-context) started in 1.301 seconds
2015-07-31 11:02:07,220 [main           ] INFO  MainSupport                    - Apache Camel 2.15.0 starting
2015-07-31 11:02:17,250 [://target/inbox] INFO  inbox-threadpool-split-route   - Done processing file: smallfile.csv

Note that the route started at 11:02:07 and the "Done processing..." statement appears at 11:02:17, i.e. 10 seconds later.

If I test again with a CSV of 5 rows, it takes 5 seconds... It seems to take 1 second per row to split and deliver to the JMS queue, which is very slow.

If I change the "to" URI to a "direct" endpoint instead of JMS, the splitting completes within a second.

Also, from the JMS listener log, all 10 messages were received within the same second. It seems the Splitter reads and splits the whole file, "prepares" the 10 JMS messages for all ten rows, and only then delivers them all to the queue, rather than "split one row and deliver one JMS message immediately".

Are there any options or configurations that could change the Splitter's behavior and improve the split performance?

I had a similar issue while processing a 14G file using the splitter with tokenizing. I was able to overcome the performance hump by using an Aggregator, as pointed out by Claus's post on Parsing Large Files with Apache Camel.

After aggregating batch messages, I used a producer template to route those messages to the messaging system. Hope that helps.
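A minimal sketch of that producer-template step, assuming the aggregated batch arrives as a String body; the class name and the "jms:queue:inventory" endpoint URI are placeholders, not the original poster's code:

import org.apache.camel.CamelContext;
import org.apache.camel.Exchange;
import org.apache.camel.Processor;
import org.apache.camel.ProducerTemplate;

// Forwards each aggregated batch to a messaging endpoint via ProducerTemplate.
public class BatchForwarder implements Processor {

    private final ProducerTemplate template;

    public BatchForwarder(CamelContext context) {
        // create the template once and reuse it; it is thread-safe
        this.template = context.createProducerTemplate();
    }

    @Override
    public void process(Exchange exchange) throws Exception {
        String batch = exchange.getIn().getBody(String.class);
        // send the whole aggregated batch as a single JMS message
        template.sendBody("jms:queue:inventory", batch);
    }
}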

Thanks for the reference link shared by @Aayush Tuladhar; I have updated my route as follows:

<camelContext id="concurrent-route-context" xmlns="http://camel.apache.org/schema/spring" trace="false">
    <route id="inbox-threadpool-split-route">
        <from uri="{{inbox.uri}}" />
        <log message="Starting to process file: ${header.CamelFileName}" />
        <split streaming="true" executorServiceRef="myPool">
            <tokenize token="\n" />
            <log message="split index - $simple{property.CamelSplitIndex}, row content=$simple{body}" />
            <aggregate strategyRef="stringBodyAggregator" completionInterval="750">
                <correlationExpression>
                    <simple>property.CamelSplitIndex</simple>
                </correlationExpression>
                <to uri="{{inventory.queue.uri}}" />
            </aggregate>
        </split>
        <log message="Done processing file: ${header.CamelFileName}" />
    </route>
</camelContext>

The trick here is that an aggregator was added within the splitter, using

property.CamelSplitIndex

as the correlationExpression. CamelSplitIndex keeps incrementing for each split row, so the aggregator doesn't actually "aggregate" anything; each "aggregation" ends immediately and the JMS message is enqueued to the JMS queue right away. The aggregationStrategy simply joins oldExchange and newExchange, but it is not important here, as it is only used to satisfy the required strategyRef attribute of the aggregate EIP.
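The stringBodyAggregator bean itself is not shown above; a minimal sketch of such a strategy, assuming it simply concatenates the two bodies (the class name and the newline separator are assumptions):

import org.apache.camel.Exchange;
import org.apache.camel.processor.aggregate.AggregationStrategy;

// Joins the old and new exchange bodies; as noted above, the join logic
// is irrelevant here because each correlation group holds a single row.
public class StringBodyAggregator implements AggregationStrategy {

    @Override
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        if (oldExchange == null) {
            // first (and, with CamelSplitIndex correlation, only) message in the group
            return newExchange;
        }
        String oldBody = oldExchange.getIn().getBody(String.class);
        String newBody = newExchange.getIn().getBody(String.class);
        oldExchange.getIn().setBody(oldBody + "\n" + newBody);
        return oldExchange;
    }
}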

One point to note: after using this trick, the performance bottleneck shifted to the JMS message producer, which was delivering only 1 message per second... I solved this by using Spring's CachingConnectionFactory to define the JMS connection.
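For reference, a minimal Spring sketch of that setup; the bean names, the cache size, and the assumption that a Tibco EMS connection factory bean named tibcoConnectionFactory is defined elsewhere are all placeholders:

<!-- wrap the vendor connection factory so JMS sessions and producers are
     cached and reused instead of being created per message
     (tibcoConnectionFactory is a placeholder for the EMS factory bean) -->
<bean id="cachingConnectionFactory" class="org.springframework.jms.connection.CachingConnectionFactory">
    <property name="targetConnectionFactory" ref="tibcoConnectionFactory"/>
    <property name="sessionCacheSize" value="10"/>
</bean>

<bean id="JmsComponent" class="org.apache.camel.component.jms.JmsComponent">
    <property name="connectionFactory" ref="cachingConnectionFactory"/>
</bean>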
