
Large CSV with Apache Camel + Aggregator

I am generating large files with Apache Camel, using an Aggregator for better performance, as described in this post: Large Files with Apache Camel

My body has 33352 rows. Using completionSize="1000" and completionTimeout="2500", the final file is missing the last 352 rows.

<camel:split streaming="true">
  <camel:simple>${body}</camel:simple>
  <camel:marshal>
    <camel:csv quote='"' quoteDisabled="false" headerDisabled="true" />
  </camel:marshal>
  <camel:aggregate strategyRef="setfepCsvStringBodyAggregator" completionSize="1000" completionTimeout="2500">
    <camel:correlationExpression>
      <constant>true</constant>
    </camel:correlationExpression>
    <to uri="file:{{setfep_dir_inprogress}}/?fileName={{setfep_filename_clientes}}.txt&amp;fileExist=Append" />
  </camel:aggregate>
</camel:split>

The final file has 33000 rows; 352 are missing.

[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000
[INFO ] org.apache.camel.util.CamelLogger.log - Complete by=size rows=1000

If I use completionSize="1000" and completionInterval="2500", my final file has 33155 rows; 197 are missing.

[Camel thread #0 - AggregateTimeoutChecker] [INFO ]  CamelLogger.log - Complete by=interval rows=566
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[Camel thread #0 - AggregateTimeoutChecker] [INFO ]  CamelLogger.log - Complete by=interval rows=43
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[Camel thread #0 - AggregateTimeoutChecker] [INFO ]  CamelLogger.log - Complete by=interval rows=401
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[Camel thread #0 - AggregateTimeoutChecker] [INFO ]  CamelLogger.log - Complete by=interval rows=768
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[Camel thread #0 - AggregateTimeoutChecker] [INFO ]  CamelLogger.log - Complete by=interval rows=377
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000
[main] [INFO ]  CamelLogger.log - Complete by=size rows=1000

How can I fix this?

Apache Camel 2.19.0

Got it!

I was running this as a unit test, and the Camel context stopped before completionInterval or completionTimeout had a chance to complete the last, partial batch.

When I put a delay in my route, I could see that all 33352 rows were written.

<camel:delay>
  <constant>5000</constant>
</camel:delay>

In production the delay isn't necessary, because the context stays alive. Alternatively, we can use the forceCompletionOnStop option:

<camel:aggregate strategyRef="setfepCsvStringBodyAggregator" forceCompletionOnStop="true" completionSize="1000" completionInterval="4000">
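To illustrate why exactly 352 rows went missing in the first run, here is a minimal plain-Java sketch of size-based batching. It does not use Camel and is not how Camel implements the aggregator; the class SizeBatchingSketch and its methods are hypothetical names made up for this illustration. It assumes the context stops before any timeout or interval fires, which is what happened in my unit test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an aggregator that completes by size only.
// This is an illustration, not Camel's actual implementation.
class SizeBatchingSketch {
    private final int completionSize;
    private final boolean forceCompletionOnStop;
    private final List<String> pending = new ArrayList<>();
    private int rowsWritten = 0;

    SizeBatchingSketch(int completionSize, boolean forceCompletionOnStop) {
        this.completionSize = completionSize;
        this.forceCompletionOnStop = forceCompletionOnStop;
    }

    // Buffer one row and write the batch once it reaches completionSize.
    void add(String row) {
        pending.add(row);
        if (pending.size() >= completionSize) {
            flush();
        }
    }

    // Models the context stopping before a timeout or interval can fire.
    void stop() {
        if (forceCompletionOnStop) {
            flush();          // the partial batch is written out
        } else {
            pending.clear();  // the partial batch is lost
        }
    }

    // Stands in for appending the aggregated batch to the file.
    private void flush() {
        rowsWritten += pending.size();
        pending.clear();
    }

    int rowsWritten() {
        return rowsWritten;
    }

    // Feed totalRows rows through the sketch, then stop it.
    static int run(int totalRows, int completionSize, boolean force) {
        SizeBatchingSketch agg = new SizeBatchingSketch(completionSize, force);
        for (int i = 0; i < totalRows; i++) {
            agg.add("row-" + i);
        }
        agg.stop();
        return agg.rowsWritten();
    }

    public static void main(String[] args) {
        System.out.println(run(33352, 1000, false)); // 33000
        System.out.println(run(33352, 1000, true));  // 33352
    }
}
```

With 33352 rows and a batch size of 1000, 33 full batches complete by size and the remaining 352 rows sit in the buffer. Without a forced completion on stop they never reach the file, which matches the 33000 rows I saw.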
