I have to setup camel to process data files where the first line of the file is the metadata and then it follows with millions of lines of actual data. The metadata dictates how the data is to be processed. What I am looking for is something like this:
Read first line (metadata) and populate a bean (with metadata) --> 2. then send data 1000 lines at a time to the data processor which will refer to the bean in step # 1
Is it possible in Apache Camel?
Yes.
An example architecture might look something like this:
Since you have a large quantity of data to deal with, it will likely be difficult to pass around 1000 line groups of data through messages, especially if they are being placed in a queue (you don't want to run out of memory). I recommend passing around some type of indicator that can be used to identify which line of the file a specific request was for, and then parse the 1000 lines when you need them. You could do this in a number of ways, like by calculating the number of bytes deep into a file a specific line is, and then using a file reader's skip() method to jump to that line when the request hits the bean that will be processing it.
Here are some resources provided on the Apache Camel website that describe the enterprise integration patterns that I mentioned above:
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.