
Apache Camel: pull data from FTP incrementally and periodically

I am very new to Apache Camel and I am exploring how to create a route that pulls data from FTP periodically, for instance every 15 minutes, and fetches only new or updated files. If a file was downloaded earlier and is still unchanged, the FTP loader should not copy it to the destination folder again.

Any advice is warmly appreciated.

UPDATE #1

I've already noticed that I need to look at the FTP2 component, and I've made some progress. The last thing I want to clarify is what consumer.delay means. Does it define the delay between download attempts? For instance, with consumer.delay = 5s: on the first attempt the FTP server contains 5 files, and the consumer pulls them somewhere and waits 5 s; on the second attempt the FTP server is unchanged, so Camel does nothing; then 5 more files arrive on the FTP server, and after 5 seconds the consumer downloads just those new files. Or does consumer.delay make the consumer wait between individual file downloads (file #1 -> 5s -> file #2 -> 5s -> etc.)?

I want to achieve the first scenario.
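To make the first scenario concrete, here is a toy model in plain Java. It is not Camel code: PollModel and its poll method are names made up for this illustration, and the only assumption is that a poll handles every not-yet-seen file in one go, with the delay falling between polls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a polling consumer (hypothetical, not part of Camel). */
final class PollModel {
    // Names of files fetched by earlier polls.
    private final Set<String> alreadyFetched = new LinkedHashSet<>();

    /** One poll: returns every listed file that no earlier poll has fetched. */
    List<String> poll(List<String> remoteListing) {
        List<String> fetched = new ArrayList<>();
        for (String name : remoteListing) {
            // Set.add returns false when the name was already present.
            if (alreadyFetched.add(name)) {
                fetched.add(name);
            }
        }
        return fetched;
    }

    public static void main(String[] args) {
        PollModel consumer = new PollModel();
        List<String> first = Arrays.asList("a.csv", "b.csv", "c.csv", "d.csv", "e.csv");

        // Poll 1: all five files are new.
        System.out.println(consumer.poll(first).size()); // prints 5

        // The delay elapses here, once per poll and not once per file.

        // Poll 2: the listing is unchanged, so nothing is fetched.
        System.out.println(consumer.poll(first).size()); // prints 0

        // Poll 3: five more files have arrived; only those are fetched.
        List<String> later = new ArrayList<>(first);
        later.addAll(Arrays.asList("f.csv", "g.csv", "h.csv", "i.csv", "j.csv"));
        System.out.println(consumer.poll(later).size()); // prints 5
    }
}
```

The model tracks files by name only, so it would not notice an updated file that keeps its name.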

Also, I observed that once files have been downloaded from FTP to the destination folder on the local file system, they are ignored in subsequent data loads, even if they have since been deleted from the local file system. How can I tell Camel to download deleted files again, and how does it store information about files it has already loaded? It also seems that it downloads all files each time, even files that were already downloaded during the first data pull. Do I need to write a filter to exclude already downloaded files?
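On the question of how the information is stored: as far as I know, when the idempotent option is on (it is switched on automatically with noop=true), Camel's file and FTP consumers record a key for each consumed file in an idempotent repository. By default that repository is an in-memory cache, and the key is derived from the remote file rather than from the local copy. That would explain why deleting the local file does not trigger a new download, and why state may be lost after a restart. The plain Java sketch below is only a conceptual stand-in for such a repository, not Camel's implementation; SeenFiles and its methods are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

/** Conceptual stand-in for an idempotent repository (hypothetical, not Camel's class). */
final class SeenFiles {
    private final Set<String> keys = new HashSet<>();

    /** The key changes whenever the remote file's timestamp or size changes. */
    static String keyFor(String name, long lastModifiedMillis, long sizeBytes) {
        return name + "-" + lastModifiedMillis + "-" + sizeBytes;
    }

    /** True the first time a key is offered, false on every repeat. */
    boolean shouldDownload(String name, long lastModifiedMillis, long sizeBytes) {
        return keys.add(keyFor(name, lastModifiedMillis, sizeBytes));
    }

    /** Forget a file so that the next poll treats it as new again. */
    boolean forget(String name, long lastModifiedMillis, long sizeBytes) {
        return keys.remove(keyFor(name, lastModifiedMillis, sizeBytes));
    }
}
```

Because the key includes the modification time and size, an updated file counts as new, while an unchanged one is skipped. Re-downloading a file that was deleted locally would mean removing its key from the repository, which is what forget models here.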

There is an FTP component for Apache Camel: http://camel.apache.org/ftp.html

Use the "consumer.delay" option to set the delay, in milliseconds, between polls. The delay applies between polls, not between individual files.
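A minimal sketch of what such an endpoint URI could look like for the 15-minute case. The host, directory, and credentials are placeholders, and FtpEndpointUri is a helper name invented here. The options noop, idempotent, and idempotentKey are documented for the FTP component, but whether this exact combination suits your setup is an assumption to verify. Newer Camel versions (3.x and later) spell the delay option "delay", without the "consumer." prefix.

```java
/** Builds an endpoint URI string for Camel's FTP component (hypothetical helper; all values are placeholders). */
final class FtpEndpointUri {
    static String build(String host, String remoteDir, String user, String password, long delayMillis) {
        return "ftp://" + user + "@" + host + "/" + remoteDir
                + "?password=" + password
                + "&consumer.delay=" + delayMillis   // wait between polls, in milliseconds
                + "&noop=true"                        // leave the files on the server
                + "&idempotent=true"                  // skip files that were already consumed
                // Key on name plus last-modified time so that an updated file counts as new.
                + "&idempotentKey=${file:name}-${file:modified}";
    }

    public static void main(String[] args) {
        long fifteenMinutes = 15L * 60L * 1000L; // 900000 ms
        String uri = build("ftp.example.com", "inbox", "demo", "secret", fifteenMinutes);
        System.out.println(uri);
        // Inside a RouteBuilder's configure() method this would be used roughly as:
        //   from(uri).to("file:/data/inbox");
    }
}
```

The helper only concatenates strings, so it does not escape special characters in the password.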

For implementation details, see http://architects.dzone.com/articles/apache-camel-integration


 